Conference paper, Open Access

An empirical analysis of pattern scan order in pattern matching

   Kuelekci, A. Oguzhan

In pattern matching, the order in which the characters of a given pattern are scanned greatly influences performance. This study investigates the effect of different pattern scan orders on natural language text and on DNA sequence data. Besides the well-known right-to-left ordering of Boyer-Moore and the least-frequent-to-most-frequent character ordering of Sunday's optimal mismatch algorithm, this work proposes four alternative scan orderings based on newly introduced distant n-gram statistics. In all experiments, Sunday's pattern matching algorithm, in which the characters of a given pattern can be scanned in any order, is used as the main framework. On natural language test data, the alternative pattern scan orders give better results for 60% of the test keywords. On genome data, the best ordering among the six tested approaches is the right-to-left order.
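
To make the idea of a configurable scan order concrete, the following is a minimal Python sketch, not the implementation used in the study. It assumes a simplified Sunday-style search that uses only the shift derived from the text character just past the current window, so the comparison order can be any permutation of the pattern positions without affecting correctness. The names `sunday_search` and `least_frequent_first` are hypothetical, and the distant n-gram orderings proposed in the paper are not reproduced here.

```python
from collections import Counter


def sunday_search(text, pattern, order=None):
    """Return all start indices of `pattern` in `text`.

    `order` is a permutation of range(len(pattern)) giving the sequence in
    which pattern positions are compared (assumption: default right-to-left).
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    order = list(range(m - 1, -1, -1)) if order is None else list(order)
    # Shift table keyed on the character just past the window; later
    # (rightmost) occurrences overwrite earlier ones.
    shift = {c: m - i for i, c in enumerate(pattern)}
    matches = []
    pos = 0
    while pos <= n - m:
        # Compare in the requested order; all() stops at the first mismatch.
        if all(text[pos + i] == pattern[i] for i in order):
            matches.append(pos)
        if pos + m >= n:
            break
        # Characters absent from the pattern allow a full jump of m + 1.
        pos += shift.get(text[pos + m], m + 1)
    return matches


def least_frequent_first(pattern, sample_text):
    """Order pattern positions from rarest to most common character,
    using character counts from `sample_text` (an illustrative stand-in
    for corpus statistics)."""
    freq = Counter(sample_text)
    return sorted(range(len(pattern)), key=lambda i: (freq[pattern[i]], -i))
```

Every scan order returns the same matches; what changes is how many character comparisons are spent before a mismatch is detected, which is the quantity the study compares across orderings.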
