Conference paper Open Access
Sak, Hasim; Guengor, Tunga; Saraclar, Murat
<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
<leader>00000nam##2200000uu#4500</leader>
<datafield tag="909" ind1="C" ind2="O">
<subfield code="p">user-tubitak-destekli-proje-yayinlari</subfield>
<subfield code="o">oai:zenodo.org:39999</subfield>
</datafield>
<datafield tag="520" ind1=" " ind2=" ">
<subfield code="a">In this paper, we propose a set of language resources for building Turkish language processing applications. Specifically, we present a finite-state implementation of a morphological parser, an averaged perceptron-based morphological disambiguator, and the compilation of a web corpus. Turkish is an agglutinative language with a highly productive inflectional and derivational morphology. We present an implementation of a morphological parser based on two-level morphology. This parser is one of the most complete parsers for Turkish and, in contrast to existing parsers, it runs independently of any external system such as PC-KIMMO. Due to the complex phonology and morphology of Turkish, parsing introduces some ambiguous parses. We developed a morphological disambiguator with an accuracy of about 98% using the averaged perceptron algorithm. We also present our efforts to build a Turkish web corpus of about 423 million words.</subfield>
</datafield>
<datafield tag="980" ind1=" " ind2=" ">
<subfield code="a">publication</subfield>
<subfield code="b">conferencepaper</subfield>
</datafield>
<datafield tag="711" ind1=" " ind2=" ">
<subfield code="a">ADVANCES IN NATURAL LANGUAGE PROCESSING, PROCEEDINGS</subfield>
</datafield>
<datafield tag="540" ind1=" " ind2=" ">
<subfield code="a">Creative Commons Attribution</subfield>
<subfield code="u">http://www.opendefinition.org/licenses/cc-by</subfield>
</datafield>
<datafield tag="773" ind1=" " ind2=" ">
<subfield code="i">isVersionOf</subfield>
<subfield code="a">10.81043/aperta.39998</subfield>
<subfield code="n">doi</subfield>
</datafield>
<datafield tag="100" ind1=" " ind2=" ">
<subfield code="a">Sak, Hasim</subfield>
<subfield code="u">Bogazici Univ, Dept Comp Engn, TR-34342 Istanbul, Turkey</subfield>
</datafield>
<datafield tag="856" ind1="4" ind2=" ">
<subfield code="z">md5:7d02e30113cae26e2b948abd8d530080</subfield>
<subfield code="s">190</subfield>
<subfield code="u">https://aperta.ulakbim.gov.tr/record/39999/files/bib-29014de1-4d5e-4c26-9725-4c7760af0139.txt</subfield>
</datafield>
<controlfield tag="005">20210315202247.0</controlfield>
<datafield tag="260" ind1=" " ind2=" ">
<subfield code="c">2008-01-01</subfield>
</datafield>
<datafield tag="024" ind1=" " ind2=" ">
<subfield code="a">10.81043/aperta.39999</subfield>
<subfield code="2">doi</subfield>
</datafield>
<datafield tag="542" ind1=" " ind2=" ">
<subfield code="l">open</subfield>
</datafield>
<datafield tag="245" ind1=" " ind2=" ">
<subfield code="a">Turkish language resources: Morphological parser, morphological disambiguator and web corpus</subfield>
</datafield>
<datafield tag="650" ind1="1" ind2="7">
<subfield code="a">cc-by</subfield>
<subfield code="2">opendefinition.org</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Guengor, Tunga</subfield>
<subfield code="u">Bogazici Univ, Dept Comp Engn, TR-34342 Istanbul, Turkey</subfield>
</datafield>
<datafield tag="700" ind1=" " ind2=" ">
<subfield code="a">Saraclar, Murat</subfield>
<subfield code="u">Bogazici Univ, Elect &amp; Elect Engn Dept, TR-34342 Bebek, Turkey</subfield>
</datafield>
<controlfield tag="001">39999</controlfield>
<datafield tag="980" ind1=" " ind2=" ">
<subfield code="a">user-tubitak-destekli-proje-yayinlari</subfield>
</datafield>
</record>
| Views | 140 |
| Downloads | 22 |
| Data volume | 4.2 kB |
| Unique views | 129 |
| Unique downloads | 21 |