Journal article Open Access

I/O-efficient data structures for non-overlapping indexing

Hooshmand, Sahar; Abedin, Paniz; Kulekci, M. Oguzhan; Thankachan, Sharma V.


MARC21 XML

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam##2200000uu#4500</leader>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Abedin, Paniz</subfield>
    <subfield code="u">Univ Cent Florida, Dept Comp Sci, Orlando, FL 32816 USA</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Kulekci, M. Oguzhan</subfield>
    <subfield code="u">Istanbul Tech Univ, Informat Inst, Istanbul, Turkey</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Thankachan, Sharma V.</subfield>
    <subfield code="u">Univ Cent Florida, Dept Comp Sci, Orlando, FL 32816 USA</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="4">
    <subfield code="c">1-7</subfield>
    <subfield code="p">THEORETICAL COMPUTER SCIENCE</subfield>
    <subfield code="v">857</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-tubitak-destekli-proje-yayinlari</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="a">Creative Commons Attribution</subfield>
    <subfield code="u">http://www.opendefinition.org/licenses/cc-by</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.1016/j.tcs.2020.12.006</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">I/O-efficient data structures for non-overlapping indexing</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="a">Hooshmand, Sahar</subfield>
    <subfield code="u">Univ Cent Florida, Dept Comp Sci, Orlando, FL 32816 USA</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="o">oai:aperta.ulakbim.gov.tr:237248</subfield>
    <subfield code="p">user-tubitak-destekli-proje-yayinlari</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="2">opendefinition.org</subfield>
    <subfield code="a">cc-by</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2021-01-01</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="u">https://aperta.ulakbim.gov.tr/record/237248/files/bib-17de205a-494b-4846-891f-bfac34f3113b.txt</subfield>
    <subfield code="z">md5:92655b18e624bffd652c5c3c9a1eb3ad</subfield>
    <subfield code="s">169</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <controlfield tag="005">20221007095137.0</controlfield>
  <controlfield tag="001">237248</controlfield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">publication</subfield>
    <subfield code="b">article</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">The non-overlapping indexing problem is defined as follows: pre-process a given text T[1, n] of length n into a data structure such that whenever a pattern P[1, m] comes as an input, we can efficiently report the largest set of non-overlapping occurrences of P in T. The best-known solution is by Cohen and Porat [ISAAC 2009]. The size of their structure is O(n) words and the query time is optimal O(m + nocc), where nocc is the output size. Later, Ganguly et al. [CPM 2015 and Algorithmica 2020] proposed a compressed space solution. We study this problem in the cache-oblivious model and present a new data structure of size O(n log n) words. It can answer queries in optimal O(m/B + log_B n + nocc/B) I/O operations, where B is the block size. The space can be improved to O(n log_{M/B} n) in the cache-aware model, where M is the size of main memory. Additionally, we study a generalization of this problem with an additional range [s, e] constraint. Here the task is to report the largest set of non-overlapping occurrences of P in T that are within the range [s, e]. We present an O(n log^2 n) space data structure in the cache-aware model that can answer queries in optimal O(m/B + log_B n + nocc_{[s,e]}/B) I/O operations, where nocc_{[s,e]} is the output size.</subfield>
  </datafield>
</record>
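
The problem stated in the abstract can be illustrated with a naive, index-free baseline. The sketch below is not the data structure from the paper and makes no claim about I/O efficiency; it only shows what a query is expected to return. It relies on the standard fact that greedily taking the leftmost occurrence and then skipping m positions yields a largest set of non-overlapping occurrences. The function name `non_overlapping_occurrences` and the reading of "within the range [s, e]" as "the occurrence lies entirely inside T[s, e]" are assumptions made for illustration.

```python
def non_overlapping_occurrences(text, pattern, s=None, e=None):
    """Return 1-based start positions of a largest set of non-overlapping
    occurrences of `pattern` in `text`.

    If s and e are given (1-based, inclusive), only occurrences lying
    entirely inside text[s..e] are considered (an assumed interpretation).
    """
    n, m = len(text), len(pattern)
    if m == 0:
        return []
    lo = 0 if s is None else s - 1  # 0-based start of the allowed window
    hi = n if e is None else e      # 0-based exclusive end of the window
    result = []
    # Greedy: take the leftmost occurrence, then resume the search
    # m positions later so the next match cannot overlap it.
    i = text.find(pattern, lo, hi)
    while i != -1:
        result.append(i + 1)
        i = text.find(pattern, i + m, hi)
    return result


print(non_overlapping_occurrences("abababa", "aba"))        # [1, 5]
print(non_overlapping_occurrences("abababa", "aba", 2, 6))  # [3]
```

This baseline scans the text on every query, whereas the structures described in the abstract pre-process T so that a query costs O(m/B + log_B n + nocc/B) I/O operations.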