An empirical analysis of pattern scan order in pattern matching

Kuelekci, A. Oguzhan

doi:10.48623/aperta.97839

Published January 1, 2007 | Version v1

Conference paper Open


In pattern matching, the order in which the characters of a pattern are scanned strongly influences performance. This study investigates the effect of different pattern scan orders on natural language text and on DNA sequence data. In addition to the well-known right-to-left order of Boyer-Moore and the least-frequent-to-most-frequent character order of Sunday's optimal mismatch algorithm, this work proposes four alternative scan orders based on newly introduced distant n-gram statistics. All experiments use Sunday's pattern matching algorithm, which allows the characters of a pattern to be scanned in any order, as the main framework. On natural language test data, the alternative scan orders give better results for 60% of the test keywords. On genome data, the best of the six tested orders is right-to-left.
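The idea of a scan-order-independent framework can be illustrated with a minimal Python sketch. It is not the author's implementation: it uses only the quick-search bad-character shift (the character just past the current window), which stays valid for any scan order, and omits the second shift table of Sunday's optimal mismatch algorithm. The four distant n-gram orders are not specified in the abstract, so only the right-to-left and least-frequent-first orders are shown. All function names are hypothetical, and the frequency table is assumed to come from the searched text itself.

```python
from collections import Counter


def quick_search_shift(pattern):
    """Shift table keyed by the text character just past the window.

    Later occurrences overwrite earlier ones, so each character maps to
    len(pattern) minus the index of its rightmost occurrence.
    """
    m = len(pattern)
    return {c: m - i for i, c in enumerate(pattern)}


def scan_order_right_to_left(pattern):
    """Boyer-Moore style order: last pattern position first."""
    return list(range(len(pattern) - 1, -1, -1))


def scan_order_least_frequent_first(pattern, freq):
    """Rarest character first; ties go to the rightmost position."""
    return sorted(range(len(pattern)),
                  key=lambda i: (freq.get(pattern[i], 0), -i))


def search(text, pattern, order):
    """Return (match positions, number of character comparisons)."""
    m, n = len(pattern), len(text)
    hits, comparisons = [], 0
    if m == 0 or m > n:
        return hits, comparisons
    shift = quick_search_shift(pattern)
    pos = 0
    while pos <= n - m:
        matched = True
        for i in order:  # compare pattern positions in the chosen order
            comparisons += 1
            if text[pos + i] != pattern[i]:
                matched = False
                break
        if matched:
            hits.append(pos)
        nxt = pos + m  # index of the character just past the window
        if nxt >= n:
            break
        # Characters absent from the pattern allow a full jump of m + 1.
        pos += shift.get(text[nxt], m + 1)
    return hits, comparisons


if __name__ == "__main__":
    text = "GATTACAGATTACA"
    pattern = "TTAC"
    freq = Counter(text)  # assumption: statistics taken from the text itself
    for name, order in [
        ("right-to-left", scan_order_right_to_left(pattern)),
        ("least-frequent-first", scan_order_least_frequent_first(pattern, freq)),
    ]:
        hits, comparisons = search(text, pattern, order)
        print(name, order, hits, comparisons)
```

Every scan order reports the same match positions. Only the comparison count differs, which is the kind of quantity an empirical comparison of scan orders would measure.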


