Journal article Open Access

Synset expansion on translation graph for automatic wordnet construction

Ercan, Gonenc; Haziyev, Farid


MARC21 XML

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam##2200000uu#4500</leader>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">user-tubitak-destekli-proje-yayinlari</subfield>
    <subfield code="o">oai:zenodo.org:75037</subfield>
  </datafield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">Research on clustering algorithms in synonymy graphs of a single language yields promising results; however, this idea has not yet been explored in a multilingual setting. Moving the problem to a multilingual translation graph enables the use of more clues and techniques that are not possible in a monolingual synonymy graph. This article explores the potential of sense induction methods in a massively multilingual translation graph. For this purpose, the performance of graph clustering methods in synset detection is investigated. In the context of translation graphs, the use of existing Wordnets in different languages is an important clue for synset detection that cannot be utilized in a monolingual setting. Casting the problem into an unsupervised synset expansion task rather than a clustering or community detection task improves the results substantially. Furthermore, instead of a greedy unsupervised expansion algorithm guided by heuristics, we devise a supervised learning algorithm able to learn synset expansion patterns from the words in existing Wordnets to achieve superior results. Because the training data is formed of already existing Wordnets, manual labeling is not required, unlike in previous work. To evaluate our methods, Wordnets for Slovenian, Persian, German and Russian are built from scratch and compared to their manually built Wordnets or labeled test-sets. Results reveal a clear improvement over two state-of-the-art algorithms targeting massively multilingual Wordnets and competitive results with Wordnet construction methods targeting a single language. The system is able to produce Wordnets from scratch with a Wordnet base concept coverage ranging from 20% to 88% for 51 languages and expands existing Wordnets by up to 30%.</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">publication</subfield>
    <subfield code="b">article</subfield>
  </datafield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="a">Creative Commons Attribution</subfield>
    <subfield code="u">http://www.opendefinition.org/licenses/cc-by</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="a">Ercan, Gonenc</subfield>
    <subfield code="u">Hacettepe Univ, Inst Informat, Ankara, Turkey</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="z">md5:56f7c0e05b86e5660a566517ebf33360</subfield>
    <subfield code="s">158</subfield>
    <subfield code="u">https://aperta.ulakbim.gov.tr/record/75037/files/bib-dd50c2a1-e8b4-4dec-a9ce-fba36e6df12b.txt</subfield>
  </datafield>
  <controlfield tag="005">20210316041940.0</controlfield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2019-01-01</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="a">10.1016/j.ipm.2018.10.002</subfield>
    <subfield code="2">doi</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Synset expansion on translation graph for automatic wordnet construction</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="4">
    <subfield code="v">56</subfield>
    <subfield code="p">INFORMATION PROCESSING &amp; MANAGEMENT</subfield>
    <subfield code="c">130-150</subfield>
    <subfield code="n">1</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Haziyev, Farid</subfield>
    <subfield code="u">Hacettepe Univ, Dept Comp Engn, Beytepe Campus PO, TR-06800 Ankara, Turkey</subfield>
  </datafield>
  <controlfield tag="001">75037</controlfield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-tubitak-destekli-proje-yayinlari</subfield>
  </datafield>
</record>