Published January 1, 2017
| Version v1
Conference paper
Open
An Annotated Corpus for Turkish Sentiment Analysis at Sentence Level
- 1. Fac Engn, Dept Comp Engn, TR-41380 Kocaeli, Turkey
Description
With the rapid growth of unstructured data accessible via the web, managing these data and finding undiscovered information in huge datasets have become necessary tasks. Consequently, text mining, which can be defined as gleaning important information from natural language text, has emerged. In this study, a Turkish sentiment corpus is constructed to facilitate information management for aspect-based sentiment analysis studies. The corpus consists of user reviews and is annotated semi-automatically. For each word in the corpus, the root form, the usage (aspect/multiaspect/seedsentiment/absent), the Part of Speech (POS) tag and the polarity are defined. The Turkish hotel review dataset used in this study, which contains 1000 reviews and 5364 sentences, was crawled from a web source. The system takes the reviews together with aspect and seedsentiment lists and returns JSON data structures of the annotated corpus. In this paper, we both provide a ready-to-use dataset for developing aspect-based sentiment analysis applications and make this dataset easy to use in Java applications by delivering it as JSON data.
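To make the input/output contract described above more concrete, the following is a minimal sketch in Python of how a sentence might be turned into a JSON record. It is not the authors' implementation: the field names (`word`, `root`, `pos`, `usage`, `polarity`), the function `annotate_sentence`, the POS labels, and the two toy lists are all hypothetical, and the tokens are assumed to arrive already analysed into root and POS tag. The `multiaspect` usage is left out because the description does not define how it is assigned.

```python
import json

# Hypothetical toy lists; the real aspect and seed-sentiment lists are not
# given in the description.
ASPECTS = {"oda", "personel"}  # "room", "staff"
SEED_SENTIMENTS = {"temiz": "positive", "kaba": "negative"}  # "clean", "rude"


def annotate_sentence(tokens):
    """Label pre-analysed tokens, given as (surface, root, pos) triples.

    Returns a dict that serialises directly to JSON. The layout is an
    assumption made for illustration, not the corpus's actual schema.
    """
    words = []
    for surface, root, pos in tokens:
        if root in ASPECTS:
            usage, polarity = "aspect", None
        elif root in SEED_SENTIMENTS:
            usage, polarity = "seedsentiment", SEED_SENTIMENTS[root]
        else:
            usage, polarity = "absent", None
        words.append(
            {
                "word": surface,
                "root": root,
                "pos": pos,
                "usage": usage,
                "polarity": polarity,
            }
        )
    return {"words": words}


# "Odalar temizdi" ("The rooms were clean"), with hand-written analyses.
sentence = [("Odalar", "oda", "Noun"), ("temizdi", "temiz", "Adj")]
record = annotate_sentence(sentence)

# ensure_ascii=False keeps Turkish characters readable in the output.
print(json.dumps(record, ensure_ascii=False, indent=2))
```

Because the result is plain JSON, a consumer in Java or any other language can read it with a standard JSON parser without knowing how the annotation was produced.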
Files
bib-9896bc69-9ed5-4227-a523-fce9135f50c8.txt