Journal article Open Access

Named-entity recognition in Turkish legal texts

Cetindag, Can; Yazicioglu, Berkay; Koc, Aykut


MARC21 XML

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam##2200000uu#4500</leader>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">user-tubitak-destekli-proje-yayinlari</subfield>
    <subfield code="o">oai:aperta.ulakbim.gov.tr:254559</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">Natural language processing (NLP) technologies and applications in legal text processing are gaining momentum. Being one of the most prominent tasks in NLP, named-entity recognition (NER) can substantiate a great convenience for NLP in law due to the variety of named entities in the legal domain and their accentuated importance in legal documents. However, domain-specific NER models in the legal domain are not well studied. We present a NER model for Turkish legal texts with a custom-made corpus as well as several NER architectures based on conditional random fields and bidirectional long-short-term memories (BiLSTMs) to address the task. We also study several combinations of different word embeddings consisting of GloVe, Morph2Vec, and neural network-based character feature extraction techniques either with BiLSTM or convolutional neural networks. We report 92.27% F1 score with a hybrid word representation of GloVe and Morph2Vec with character-level features extracted with BiLSTM. Being an agglutinative language, the morphological structure of Turkish is also considered. To the best of our knowledge, our work is the first legal domain-specific NER study in Turkish and also the first study for an agglutinative language in the legal domain. Thus, our work can also have implications beyond the Turkish language.</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">publication</subfield>
    <subfield code="b">article</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="a">Creative Commons Attribution</subfield>
    <subfield code="u">http://www.opendefinition.org/licenses/cc-by</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="a">Cetindag, Can</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="z">md5:30ee6c1a8a142b00509ce1dc0950c61b</subfield>
    <subfield code="s">141</subfield>
    <subfield code="u">https://aperta.ulakbim.gov.tr/record/254559/files/bib-8a4a0baa-462c-4ba8-8497-b4a5d8e38edc.txt</subfield>
  </datafield>
  <controlfield tag="005">20230728211836.0</controlfield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2023-01-01</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.1017/S1351324922000304</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Named-entity recognition in Turkish legal texts</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="4">
    <subfield code="v">29</subfield>
    <subfield code="p">NATURAL LANGUAGE ENGINEERING</subfield>
    <subfield code="c">615-642</subfield>
    <subfield code="n">3</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Yazicioglu, Berkay</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Koc, Aykut</subfield>
  </datafield>
  <controlfield tag="001">254559</controlfield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-tubitak-destekli-proje-yayinlari</subfield>
  </datafield>
</record>