Description of Turkish Paraphrase Corpus Structure and Generation Method

Karaoglan, Bahar; Kisla, Tarik; Metin, Senem Kumova

doi:10.1007/978-3-319-75477-2_13

1 January 2018 Conference paper Open Access

DataCite XML

<?xml version='1.0' encoding='utf-8'?>
<resource xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://datacite.org/schema/kernel-4" xsi:schemaLocation="http://datacite.org/schema/kernel-4 http://schema.datacite.org/meta/kernel-4.1/metadata.xsd">
  <identifier identifierType="URL">https://aperta.ulakbim.gov.tr/record/33379</identifier>
  <creators>
    <creator>
      <creatorName>Karaoglan, Bahar</creatorName>
      <givenName>Bahar</givenName>
      <familyName>Karaoglan</familyName>
      <affiliation>Ege Univ, Izmir, Turkey</affiliation>
    </creator>
    <creator>
      <creatorName>Kisla, Tarik</creatorName>
      <givenName>Tarik</givenName>
      <familyName>Kisla</familyName>
      <affiliation>Ege Univ, Izmir, Turkey</affiliation>
    </creator>
    <creator>
      <creatorName>Metin, Senem Kumova</creatorName>
      <givenName>Senem Kumova</givenName>
      <familyName>Metin</familyName>
      <affiliation>Izmir Univ Econ, Izmir, Turkey</affiliation>
    </creator>
  </creators>
  <titles>
    <title>Description Of Turkish Paraphrase Corpus Structure And Generation Method</title>
  </titles>
  <publisher>Aperta</publisher>
  <publicationYear>2018</publicationYear>
  <dates>
    <date dateType="Issued">2018-01-01</date>
  </dates>
  <resourceType resourceTypeGeneral="Text">Conference paper</resourceType>
  <alternateIdentifiers>
    <alternateIdentifier alternateIdentifierType="url">https://aperta.ulakbim.gov.tr/record/33379</alternateIdentifier>
  </alternateIdentifiers>
  <relatedIdentifiers>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsIdenticalTo">10.1007/978-3-319-75477-2_13</relatedIdentifier>
  </relatedIdentifiers>
  <rightsList>
    <rights rightsURI="http://www.opendefinition.org/licenses/cc-by">Creative Commons Attribution</rights>
    <rights rightsURI="info:eu-repo/semantics/openAccess">Open Access</rights>
  </rightsList>
  <descriptions>
    <description descriptionType="Abstract">Because developing a corpus takes a long time and considerable human effort, it is desirable to make it as resourceful as possible: rich in coverage, flexible, multipurpose and expandable. Here we describe the steps we took in developing a Turkish paraphrase corpus, the factors we considered, the problems we faced and how we dealt with them. The corpus currently contains nearly 4000 sentences, with a ratio of 60% paraphrase to 40% non-paraphrase sentence pairs. The sentence pairs are annotated on a 5-point scale: paraphrase, encapsulating, encapsulated, non-paraphrase and opposite. The corpus is organized as a database integrated with a Turkish dictionary. The sources used so far are news texts from the Bilcon 2005 corpus, a set of professionally translated sentence pairs from the MSRP corpus, multiple Turkish translations from different languages included in the Tatoeba corpus, and user-generated paraphrases.</description>
  </descriptions>
</resource>
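
The record above can be read programmatically. The following is a minimal sketch, assuming Python's standard `xml.etree.ElementTree` module and a trimmed copy of the record (abstract, affiliations and rights omitted for brevity). The names `NS`, `RECORD` and `summarize` are hypothetical and are not part of the Aperta export; the element names and the namespace URI are taken from the XML itself.

```python
import xml.etree.ElementTree as ET

# Namespace declared on the <resource> element of the record.
NS = {"dc": "http://datacite.org/schema/kernel-4"}

# Trimmed copy of the record, for illustration only.
RECORD = """<resource xmlns="http://datacite.org/schema/kernel-4">
  <creators>
    <creator><creatorName>Karaoglan, Bahar</creatorName></creator>
    <creator><creatorName>Kisla, Tarik</creatorName></creator>
    <creator><creatorName>Metin, Senem Kumova</creatorName></creator>
  </creators>
  <titles>
    <title>Description Of Turkish Paraphrase Corpus Structure And Generation Method</title>
  </titles>
  <publicationYear>2018</publicationYear>
  <relatedIdentifiers>
    <relatedIdentifier relatedIdentifierType="DOI" relationType="IsIdenticalTo">10.1007/978-3-319-75477-2_13</relatedIdentifier>
  </relatedIdentifiers>
</resource>"""


def summarize(xml_text):
    """Return title, year, creator names and related DOIs from a DataCite record."""
    root = ET.fromstring(xml_text)
    # Every element lives in the DataCite namespace, so each path step is prefixed.
    creators = [
        c.findtext("dc:creatorName", namespaces=NS)
        for c in root.findall("dc:creators/dc:creator", NS)
    ]
    # Attributes are not namespaced, so a plain get() is enough here.
    dois = [
        r.text
        for r in root.findall("dc:relatedIdentifiers/dc:relatedIdentifier", NS)
        if r.get("relatedIdentifierType") == "DOI"
    ]
    return {
        "title": root.findtext("dc:titles/dc:title", namespaces=NS),
        "year": root.findtext("dc:publicationYear", namespaces=NS),
        "creators": creators,
        "dois": dois,
    }


summary = summarize(RECORD)
print(summary["creators"])
print(summary["dois"])
```

Passing the namespace map explicitly matters because the record declares a default namespace; an unprefixed path such as `creators/creator` would match nothing.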

Views	68
Downloads	17
Data volume	3.4 kB
Unique views	64
Unique downloads	17

Record Information

Publication date: 01/01/2018
Conference information: COMPUTATIONAL LINGUISTICS AND INTELLIGENT TEXT PROCESSING, (CICLING 2016), PT I
License: Creative Commons Attribution

Citations

Citation Indexes: 2

Readership Statistics

Readers: 2
