Journal article, Open Access

Machine Learning-Based Text Classification Comparison: Turkish Language Context

   Alzoubi, Yehia Ibrahim; Topcu, Ahmet E.; Erkaya, Ahmed Enis

The growth of textual data that has accompanied the wider use of online services, together with the ease of accessing these data, has led to a rise in the number of text classification research papers. Text classification has a significant influence on several domains, such as news categorization, spam detection, and sentiment analysis. This work focuses on the classification of Turkish text, since only a few studies have been conducted in this context. To evaluate the proposed techniques, we use data obtained from customer inquiries submitted to an institution. Each inquiry is assigned a class defined in the institution's internal procedures. The Support Vector Machine, Naive Bayes, Long Short-Term Memory, Random Forest, and Logistic Regression algorithms were used to classify the data. The performance of the techniques was analyzed before and after data preparation, and the results were compared. The Long Short-Term Memory technique achieved the highest accuracy, 84%, surpassing the best result among the traditional techniques, which was 78% for the Support Vector Machine. The techniques performed better once the number of categories in the dataset was reduced. Moreover, the findings show that data preparation and the match between the number of classes and the amount of training data are significant variables influencing the techniques' performance. The findings of this study and the text classification techniques used may be applied to data in languages other than Turkish.
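
To make the "before and after data preparation" comparison concrete, here is a minimal sketch using only the Python standard library. It is not the authors' pipeline: the record does not describe their preparation steps, features, or dataset, so the inquiries, the labels ("billing", "technical"), and the helper names (prepare, NaiveBayes, accuracy) are all hypothetical. Only Naive Bayes, one of the traditional techniques named in the abstract, is shown; the other techniques are out of scope here.

```python
import math
import re
from collections import Counter, defaultdict


def prepare(text):
    """Toy data preparation: Turkish-aware lowercasing, then punctuation removal."""
    # str.lower() maps "I" to "i" and "İ" to "i" plus a combining dot,
    # so the two Turkish capitals are handled explicitly first.
    text = text.replace("İ", "i").replace("I", "ı").lower()
    return re.sub(r"[^\w\s]", " ", text)


class NaiveBayes:
    """Multinomial Naive Bayes over whitespace tokens with add-one smoothing."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc.split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in doc.split():
                score += math.log((counts[word] + 1) / denom)  # smoothed likelihood
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def accuracy(train, test, transform=lambda t: t):
    """Fit on train, return the fraction of test items classified correctly."""
    model = NaiveBayes().fit([transform(t) for t, _ in train], [y for _, y in train])
    hits = sum(model.predict(transform(t)) == y for t, y in test)
    return hits / len(test)


# Hypothetical customer inquiries; the real dataset is not part of this record.
train = [
    ("Faturam çok yüksek geldi", "billing"),
    ("Fatura ödeme tarihi ne zaman?", "billing"),
    ("İnternet bağlantım kesildi", "technical"),
    ("Modem çalışmıyor, bağlantı yok", "technical"),
]
test = [
    ("FATURA sorunu!", "billing"),
    ("Bağlantı kesildi.", "technical"),
]

raw_acc = accuracy(train, test)                          # no preparation
prepared_acc = accuracy(train, test, transform=prepare)  # with preparation
print(f"raw: {raw_acc:.2f}  prepared: {prepared_acc:.2f}")
```

On this toy data the raw variant misses the first held-out inquiry because "FATURA" and "sorunu!" match no training token, while the prepared variant classifies both correctly. The numbers illustrate the mechanism only and say nothing about the accuracies reported in the abstract.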

Files (168 Bytes)
File name Size
bib-0e42019c-ec1f-4852-b9d1-15c412a74a27.txt
md5:1f7e0019d9f29796728c1ada62e34998
168 Bytes
Views 15
Downloads 10
Data volume 1.7 kB
Unique views 14
Unique downloads 9
Citations
  • Citation Indexes: 17
Readership Statistics
  • Readers: 35
Mentions
  • Blog Mentions: 1
  • News Mentions: 1