An Annotated Corpus for Turkish Sentiment Analysis at Sentence Level

Omurca, Sevinc Ihan; Ekinci, Ekin; Turkmen, Hazal

doi:10.48623/aperta.45795

Published January 1, 2017 | Version v1

Conference paper | Open access

1. Fac Engn, Dept Comp Engn, TR-41380 Kocaeli, Turkey

With the rapid growth of unstructured data accessible via the web, managing these data and finding undiscovered information in huge datasets has become a necessary task. Consequently, text mining, which can be defined as gleaning important information from natural language text, has emerged. In this study, to facilitate information management for aspect-based sentiment analysis studies, a Turkish sentiment corpus is constructed; it comprises user reviews and is annotated semi-automatically. The constructed corpus defines the root form of each word, the usage of the word (aspect/multiaspect/seedsentiment/absent), its Part of Speech (POS) tag, and its polarity. The Turkish hotel review dataset used in this study, which contains 1000 reviews and 5364 sentences, was crawled from a web source. The system takes reviews together with aspect and seedsentiment lists and returns JSON data structures of the annotated corpus. This paper both provides a ready-to-use dataset for developing aspect-based sentiment analysis applications and makes this dataset easy to use from Java applications by supplying it as JSON data.
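The abstract describes a system that takes reviews plus aspect and seedsentiment lists and returns JSON. The sketch below illustrates what one annotated sentence might look like under that description. It is not the authors' implementation or schema: the function name `annotate_sentence`, the JSON field names, the example sentence, and its root/POS values are all assumptions made for illustration. Because this record does not explain when the `multiaspect` label applies, the sketch never assigns it.

```python
import json

# Usage labels named in the abstract. The criterion for "multiaspect" is not
# given in this record, so this sketch never produces it.
USAGE_LABELS = ("aspect", "multiaspect", "seedsentiment", "absent")


def annotate_sentence(tokens, aspects, seed_sentiments):
    """Label pre-analysed tokens against aspect and seed-sentiment lists.

    tokens          -- list of (surface, root, pos) tuples; root forms and POS
                       tags are assumed to come from an external analyser
    aspects         -- set of aspect root forms
    seed_sentiments -- dict mapping a seed-sentiment root to its polarity

    All field names in the returned dicts are hypothetical.
    """
    annotated = []
    for surface, root, pos in tokens:
        if root in aspects:
            usage, polarity = "aspect", None
        elif root in seed_sentiments:
            usage, polarity = "seedsentiment", seed_sentiments[root]
        else:
            usage, polarity = "absent", None
        annotated.append({
            "word": surface,
            "root": root,
            "usage": usage,
            "pos": pos,
            "polarity": polarity,
        })
    return annotated


# Illustrative input: "Odalar temizdi." ("The rooms were clean.")
tokens = [("Odalar", "oda", "Noun"), ("temizdi", "temiz", "Adj")]
record = {
    "sentence": "Odalar temizdi.",
    "tokens": annotate_sentence(tokens, {"oda"}, {"temiz": "positive"}),
}

# ensure_ascii=False keeps Turkish characters readable in the output.
print(json.dumps(record, ensure_ascii=False, indent=2))
```

A JSON record of this kind is language-neutral, so a Java application could read it with any standard JSON library, which is consistent with the ease-of-use claim in the abstract.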

Files

Name	Size
bib-9896bc69-9ed5-4227-a523-fce9135f50c8.txt (md5:440e8b6e955c5e41df992a81a3d5ccaf)	194 Bytes

	All versions	This version
Views	109	109
Downloads	29	29
Data volume	5.6 kB	5.6 kB
