Published January 1, 2018
| Version v1
Conference paper
Open
Characters or Morphemes: How to Represent Words?
Creators
- 1. Middle East Tech Univ, Cognit Sci Dept, Informat Inst, Ankara, Turkey
- 2. Hacettepe Univ, Dept Comp Engn, Ankara, Turkey
Description
In this paper, we investigate the effects of using subword information in representation learning. We argue that using syntactic subword units positively affects the quality of word representations. We introduce a morpheme-based model and compare it against word-based, character-based, and character n-gram level models. Our model takes a list of candidate segmentations of a word and learns the word's representation from these segmentations, weighted by an attention mechanism. We performed experiments on Turkish, a morphologically rich language, and on English, which has a comparably poorer morphology. The results show that morpheme-based models learn better word representations for morphologically complex languages than character-based and character n-gram level models, since morphemes help incorporate more syntactic knowledge during learning, which makes morpheme-based models better at syntactic tasks.
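The attention-weighted combination of candidate segmentations described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the morpheme embeddings, the scoring vector, and the example segmentations of the Turkish word "kitaplarim" ("my books") are all hypothetical assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 8

# Hypothetical morpheme vocabulary with (untrained) random embeddings.
morphemes = ["kitap", "lar", "im", "kitapl", "arim"]
emb = {m: rng.normal(size=dim) for m in morphemes}

# Candidate segmentations of the word "kitaplarim" (illustrative only).
segmentations = [["kitap", "lar", "im"], ["kitapl", "arim"]]

def segment_vec(seg):
    # One segmentation's embedding: the sum of its morpheme embeddings.
    return np.sum([emb[m] for m in seg], axis=0)

# Attention: score each segmentation with a (here random) vector v,
# then normalize the scores with a softmax.
v = rng.normal(size=dim)
vecs = np.stack([segment_vec(s) for s in segmentations])
scores = vecs @ v
weights = np.exp(scores - scores.max())
weights /= weights.sum()

# Word representation: attention-weighted sum over segmentation embeddings.
word_vec = weights @ vecs
```

In training, the morpheme embeddings and the scoring parameters would be learned jointly with the rest of the model, so that more plausible segmentations receive higher attention weights.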
Files
- bib-db77027f-3048-428a-89e9-05250815ed34.txt (122 Bytes, md5:20b413609bb9bd24a552bf616eae9a3c)