Enlarging multiword expression dataset by co-training

Kumova Metin, Senem

doi:10.3906/elk-1709-185

Published 1 January 2018 | Version v1

Journal article, open access

Kumova Metin, Senem¹

1. Izmir University of Economics, Faculty of Engineering, Department of Software Engineering, Izmir, Turkey

In multiword expressions (MWEs), multiple words combine to form a new unit of language. When MWE identification is treated as a binary classification task, one of the most important factors in performance is training the classifier with a sufficient number of labelled samples. Since manual labelling is time-consuming, the performance of MWE recognition studies is limited by the size of the training sets. In this study, we propose comparison-based and common-decision co-training approaches to enlarge the MWE dataset. In the experiments, the proposed approaches were compared with standard co-training [1] and manual labelling, with statistical and linguistic features employed as two different views of the MWE dataset [2]. A number of tests with different settings were performed on a Turkish MWE dataset. Ten different classifiers were used in the experiments, and the best-performing classifier pair was observed to be the SMO-SMO pair. The experimental results showed that the common-decision co-training approach is an alternative to hand-labelling large MWE datasets and that both newly proposed approaches outperform standard co-training [1] when the training set is to be enlarged for MWE classification.
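As a rough illustration of the standard two-view co-training loop that the abstract uses as its baseline, the sketch below lets one classifier per view take turns labelling the unlabelled sample it is most confident about and adding it to a shared labelled pool. Everything specific here is an assumption made for illustration: the nearest-centroid classifier stands in for the SMO classifiers reported in the study, the function names are hypothetical, and the toy feature values are invented. It does not implement the comparison-based or common-decision variants, whose details are not given in the abstract.

```python
from statistics import mean


def centroid_fit(X, y):
    """Return one centroid per class for feature vectors X with labels y."""
    centroids = {}
    for label in set(y):
        rows = [x for x, t in zip(X, y) if t == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def centroid_predict(centroids, x):
    """Return (label, confidence); confidence is the distance margin
    between the nearest and the second-nearest class centroid."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, c)) ** 0.5, label)
        for label, c in centroids.items()
    )
    (d_best, label), (d_next, _) = dists[0], dists[1]
    return label, d_next - d_best


def co_train(view_a, view_b, labels, rounds=5, per_round=1):
    """Enlarge a labelled set using two feature views of the same samples.

    labels[i] is 0 or 1 for labelled samples and None for unlabelled ones.
    Both classes must appear among the initial labels.
    """
    labels = list(labels)  # work on a copy; the caller's list is untouched
    for _ in range(rounds):
        for view in (view_a, view_b):
            known = [i for i, t in enumerate(labels) if t is not None]
            unknown = [i for i, t in enumerate(labels) if t is None]
            if not unknown:
                return labels
            model = centroid_fit([view[i] for i in known],
                                 [labels[i] for i in known])
            scored = sorted(
                ((centroid_predict(model, view[i]), i) for i in unknown),
                key=lambda s: -s[0][1],
            )
            # Move the most confident predictions into the labelled pool.
            for (label, _), i in scored[:per_round]:
                labels[i] = label
    return labels


# Hypothetical toy data: view_a mimics a statistical feature,
# view_b mimics two linguistic features. 1 = MWE, 0 = non-MWE.
view_a = [[0.9], [0.1], [0.8], [0.2], [0.85], [0.15]]
view_b = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1],
          [0.1, 0.9], [0.8, 0.2], [0.2, 0.8]]
seed_labels = [1, 0, None, None, None, None]

print(co_train(view_a, view_b, seed_labels))
```

On this well-separated toy data, the two seed labels are enough for the loop to label the remaining four samples. On real data, wrong but confident predictions can propagate through later rounds, which is one reason to compare automatically enlarged sets against manual labelling.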

Files

Name	Size
10-3906-elk-1709-185.pdf (md5:aa5a485851d74dcb1db36dd45abddb87)	445.3 kB

	All versions	This version
Views	90	90
Downloads	46	46
Data volume	20.5 MB	20.5 MB
