Conference paper Open Access

Document Embedding based Supervised Methods for Turkish Text Classification

Celenli, Halil I.; Ozturk, S. Talha; Sahin, Gurkan; Gerek, Aydin; Ganiz, Murat C.


MARC21 XML

<?xml version='1.0' encoding='UTF-8'?>
<record xmlns="http://www.loc.gov/MARC21/slim">
  <leader>00000nam##2200000uu#4500</leader>
  <controlfield tag="001">36969</controlfield>
  <datafield tag="540" ind1=" " ind2=" ">
    <subfield code="a">Creative Commons Attribution</subfield>
    <subfield code="u">http://www.opendefinition.org/licenses/cc-by</subfield>
  </datafield>
  <datafield tag="773" ind1=" " ind2=" ">
    <subfield code="i">isVersionOf</subfield>
    <subfield code="a">10.81043/aperta.36968</subfield>
    <subfield code="n">doi</subfield>
  </datafield>
  <controlfield tag="005">20210315194258.0</controlfield>
  <datafield tag="520" ind1=" " ind2=" ">
    <subfield code="a">Following the recent increase in the amount of available data, Deep Learning has become the most popular branch of Machine Learning. This trend can also be seen in Natural Language Processing (NLP), especially since textual data can now be scraped from the World Wide Web in vast quantities and used in an unsupervised or semi-supervised manner. For this reason, Deep Learning methods are being used more frequently. In this work we devise several classification methods based on the Paragraph Vector model (a.k.a. Doc2Vec), which represents documents as vectors. These include a k-Nearest Neighbor classifier (k-NN), Support Vector Machines (SVM), a Centroid Classifier (CC) that works on paragraph vectors of documents, and a custom-made method which uses pairwise cosine similarities between documents and class centroids as features in Doc2Vec space. Our experiments use a number of representations and classifiers combined in various ways. On the representation side, the Paragraph Vector model is compared with Term Frequency (tf) and Term Frequency-Inverse Document Frequency (tf-idf) using SVM, k-NN, CC and Centroid Features Support Vector Machine (CFSVM) as classifiers.</subfield>
  </datafield>
  <datafield tag="245" ind1=" " ind2=" ">
    <subfield code="a">Document Embedding based Supervised Methods for Turkish Text Classification</subfield>
  </datafield>
  <datafield tag="909" ind1="C" ind2="O">
    <subfield code="p">user-tubitak-destekli-proje-yayinlari</subfield>
    <subfield code="o">oai:zenodo.org:36969</subfield>
  </datafield>
  <datafield tag="542" ind1=" " ind2=" ">
    <subfield code="l">open</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">user-tubitak-destekli-proje-yayinlari</subfield>
  </datafield>
  <datafield tag="856" ind1="4" ind2=" ">
    <subfield code="z">md5:50c570bfc2bb4d9a82df7f8dbb1457b8</subfield>
    <subfield code="s">219</subfield>
    <subfield code="u">https://aperta.ulakbim.gov.tr/record/36969/files/bib-7a7a136c-f514-49ce-bd32-3e629a5722e6.txt</subfield>
  </datafield>
  <datafield tag="024" ind1=" " ind2=" ">
    <subfield code="2">doi</subfield>
    <subfield code="a">10.81043/aperta.36969</subfield>
  </datafield>
  <datafield tag="711" ind1=" " ind2=" ">
    <subfield code="a">2018 3RD INTERNATIONAL CONFERENCE ON COMPUTER SCIENCE AND ENGINEERING (UBMK)</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Marmara Univ, Muhendislik Fak, Bilgisayar Muhendisligi, Istanbul, Turkey</subfield>
    <subfield code="a">Ozturk, S. Talha</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="a">Sahin, Gurkan</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Marmara Univ, Muhendislik Fak, Bilgisayar Muhendisligi, Istanbul, Turkey</subfield>
    <subfield code="a">Gerek, Aydin</subfield>
  </datafield>
  <datafield tag="700" ind1=" " ind2=" ">
    <subfield code="u">Marmara Univ, Muhendislik Fak, Bilgisayar Muhendisligi, Istanbul, Turkey</subfield>
    <subfield code="a">Ganiz, Murat C.</subfield>
  </datafield>
  <datafield tag="650" ind1="1" ind2="7">
    <subfield code="a">cc-by</subfield>
    <subfield code="2">opendefinition.org</subfield>
  </datafield>
  <datafield tag="980" ind1=" " ind2=" ">
    <subfield code="a">publication</subfield>
    <subfield code="b">conferencepaper</subfield>
  </datafield>
  <datafield tag="260" ind1=" " ind2=" ">
    <subfield code="c">2018-01-01</subfield>
  </datafield>
  <datafield tag="100" ind1=" " ind2=" ">
    <subfield code="a">Celenli, Halil I.</subfield>
  </datafield>
</record>
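
The abstract in field 520 describes a Centroid Classifier and a method that uses cosine similarities between documents and class centroids as features. The following is a minimal sketch of that idea, not the authors' implementation. It assumes document vectors are already available (small hand-made arrays stand in here for Doc2Vec paragraph vectors), and the function names `class_centroids`, `cosine_features` and `centroid_classify` are hypothetical.

```python
import numpy as np

def class_centroids(X, y):
    """Return the sorted class labels and one mean vector per class."""
    labels = np.unique(y)
    centroids = np.stack([X[y == c].mean(axis=0) for c in labels])
    return labels, centroids

def cosine_features(X, centroids, eps=1e-12):
    """Cosine similarity of every row of X to every class centroid."""
    Xn = X / (np.linalg.norm(X, axis=1, keepdims=True) + eps)
    Cn = centroids / (np.linalg.norm(centroids, axis=1, keepdims=True) + eps)
    return Xn @ Cn.T  # shape: (n_documents, n_classes)

def centroid_classify(X, labels, centroids):
    """Assign each document to the class with the most similar centroid."""
    return labels[np.argmax(cosine_features(X, centroids), axis=1)]

# Toy vectors standing in for paragraph vectors (assumption, not real data).
X_train = np.array([[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.8]])
y_train = np.array(["sports", "sports", "politics", "politics"])

labels, centroids = class_centroids(X_train, y_train)
features = cosine_features(X_train, centroids)   # one column per class
predicted = centroid_classify(X_train, labels, centroids)
```

In a setup like the CFSVM named in the abstract, the `features` matrix (one similarity per class) would presumably be passed to an SVM instead of the raw document vectors; that step is left out here to keep the sketch dependency-free.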