Conference paper Open Access

Turkish Treebanking: Unifying and Constructing Efforts

Turk, Utku; Atmaca, Furkan; Ozates, Saziye Betul; Koksal, Abdullatif; Ozturk, Balkiz; Gungor, Tunga; Ozgur, Arzucan


MARC21 XML

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam##2200000uu#4500</leader>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Turkish Treebanking: Unifying and Constructing Efforts</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.81043/aperta.74453</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <controlfield tag="001">74453</controlfield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-tubitak-destekli-proje-yayinlari</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">In this paper, we present the re-annotation of the Turkish PUD Treebank and the first annotation of the Turkish National Corpus Universal Dependency (henceforth TNC-UD) Treebank as part of our efforts to unify and extend the Turkish universal dependency treebanks. In accordance with the Universal Dependencies' guidelines and the necessities of Turkish grammar, both treebanks, the Turkish PUD Treebank and TNC-UD, were revised with regard to their syntactic relations. The TNC-UD is planned to have 10,000 sentences. In this paper, we present the first 500 sentences along with the re-annotation of the PUD Treebank. This paper also reports the parsing results of a graph-based neural parser on the previous and re-annotated PUD, as well as the TNC-UD. In light of the comparisons, even though we observe a slight decrease in the attachment scores of the Turkish PUD treebank, we demonstrate that the annotation of the TNC-UD improves the parsing accuracy of Turkish. In addition to the treebanks, we have also constructed custom annotation software with advanced filtering and morphological editing options. Both treebanks, including a full edit history and the annotation guidelines, as well as the custom software, are publicly available online under an open license.</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="2">opendefinition.org</subfield>
    <subfield code="a">cc-by</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Bogazici Univ, Dept Linguist, TR-34342 Istanbul, Turkey</subfield>
    <subfield code="a">Atmaca, Furkan</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Bogazici Univ, Dept Comp Engn, TR-34342 Istanbul, Turkey</subfield>
    <subfield code="a">Ozates, Saziye Betul</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Bogazici Univ, Dept Comp Engn, TR-34342 Istanbul, Turkey</subfield>
    <subfield code="a">Koksal, Abdullatif</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Bogazici Univ, Dept Linguist, TR-34342 Istanbul, Turkey</subfield>
    <subfield code="a">Ozturk, Balkiz</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Bogazici Univ, Dept Comp Engn, TR-34342 Istanbul, Turkey</subfield>
    <subfield code="a">Gungor, Tunga</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Bogazici Univ, Dept Comp Engn, TR-34342 Istanbul, Turkey</subfield>
    <subfield code="a">Ozgur, Arzucan</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="b">conferencepaper</subfield>
    <subfield code="a">publication</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="u">Bogazici Univ, Dept Linguist, TR-34342 Istanbul, Turkey</subfield>
    <subfield code="a">Turk, Utku</subfield>
  </datafield>
  <datafield tag="711" ind1=" " ind2=" ">
    <subfield code="a">13TH LINGUISTIC ANNOTATION WORKSHOP (LAW XIII)</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2019-01-01</subfield>
  </datafield>
  <controlfield tag="005">20210316041148.0</controlfield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="n">doi</subfield>
    <subfield code="a">10.81043/aperta.74452</subfield>
    <subfield code="i">isVersionOf</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="o">oai:zenodo.org:74453</subfield>
    <subfield code="p">user-tubitak-destekli-proje-yayinlari</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="z">md5:cdbe717ff5622c42e5509246bb966d16</subfield>
    <subfield code="s">191</subfield>
    <subfield code="u">https://aperta.ulakbim.gov.tr/record/74453/files/bib-f7f27652-48c8-4814-bee2-d43cf2d7b74e.txt</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="u">http://www.opendefinition.org/licenses/cc-by</subfield>
    <subfield code="a">Creative Commons Attribution</subfield>
  </datafield>
</record>
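
The record above can be read with any namespace-aware XML parser. The following is a minimal sketch in Python using only the standard library. The helper name `subfields` and the abbreviated `SAMPLE` string are illustrative assumptions, not part of the repository's own tooling. `SAMPLE` copies only a few fields from the record above.

```python
import xml.etree.ElementTree as ET

# Namespace used by MARC21 "slim" XML records (taken from the record above).
MARC_NS = {"marc": "http://www.loc.gov/MARC21/slim"}

# Abbreviated copy of the record above, kept short for illustration only.
SAMPLE = """<record xmlns="http://www.loc.gov/MARC21/slim">
  <controlfield tag="001">74453</controlfield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Turkish Treebanking: Unifying and Constructing Efforts</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.81043/aperta.74453</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="a">Turk, Utku</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Atmaca, Furkan</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Ozates, Saziye Betul</subfield>
  </datafield>
</record>"""


def subfields(record, tag, code):
    """Hypothetical helper: text of every subfield `code` in every datafield `tag`."""
    path = f"marc:datafield[@tag='{tag}']/marc:subfield[@code='{code}']"
    return [el.text for el in record.findall(path, MARC_NS)]


record = ET.fromstring(SAMPLE)

title = subfields(record, "245", "a")[0]   # 245$a holds the title
doi = subfields(record, "024", "a")[0]     # 024$a holds the identifier, 024$2 its scheme
# 100$a is the first author, 700$a the remaining authors, in document order.
authors = subfields(record, "100", "a") + subfields(record, "700", "a")

print(title)
print(doi)
print("; ".join(authors))
```

Looking fields up by tag and subfield code, rather than by position, matters here because the export does not list datafields in numeric tag order (for example, field 100 appears after the 700 fields).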