Konferans bildirisi Açık Erişim

Alternative PPM Model for Quality Score Compression

Akgun, Mete; Sagiroglu, Mahmut Samil


MARC21 XML

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam##2200000uu#4500</leader>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Sagiroglu, Mahmut Samil</subfield>
    <subfield code="u">Tubitak BILGEM, TR-41470 Gebze, Kocaeli, Turkey</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-tubitak-adresli-yayinlar</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="a">Creative Commons Attribution</subfield>
    <subfield code="u">http://www.opendefinition.org/licenses/cc-by</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="i">isVersionOf</subfield>
    <subfield code="a">10.81043/aperta.91660</subfield>
    <subfield code="n">doi</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.81043/aperta.91661</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Alternative PPM Model for Quality Score Compression</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="a">Akgun, Mete</subfield>
    <subfield code="u">Tubitak BILGEM, TR-41470 Gebze, Kocaeli, Turkey</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="o">oai:zenodo.org:91661</subfield>
    <subfield code="p">user-tubitak-adresli-yayinlar</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="2">opendefinition.org</subfield>
    <subfield code="a">cc-by</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2013-01-01</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="u">https://aperta.ulakbim.gov.trrecord/91661/files/bib-deecf37f-8366-41bc-b9ae-29f23e14feb0.txt</subfield>
    <subfield code="z">md5:9b6d120de22dda2a93124225e58898bd</subfield>
    <subfield code="s">200</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <controlfield tag="005">20210316122420.0</controlfield>
  <controlfield tag="001">91661</controlfield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">publication</subfield>
    <subfield code="b">conferencepaper</subfield>
  </datafield>
  <datafield tag="711" ind1=" " ind2=" ">
    <subfield code="a">BIOINFORMATICS 2013: PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON BIOINFORMATICS MODELS, METHODS AND ALGORITHMS</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">Next Generation Sequencing (NGS) platforms generate header data and quality information for each nucleotide sequence. These platforms may produce gigabyte-scale datasets. The storage of these datasets is one of the major bottlenecks of NGS technology. Information produced by NGS are stored in FASTQ format. In this paper, we propose an algorithm to compress quality score information stored in a FASTQ file. We try to find a model that gives the lowest entropy on quality score data. We combine our powerful statistical model with arithmetic coding to compress the quality score data the smallest. We compare its performance to text compression utilities such as bzip2, gzip and ppmd and existing compression algorithms for quality scores. We show that the performance of our compression algorithm is superior to that of both systems.</subfield>
  </datafield>
</record>
26
5
görüntülenme
indirilme
Görüntülenme 26
İndirme 5
Veri hacmi 1.0 kB
Tekil görüntülenme 22
Tekil indirme 5

Alıntı yap