Journal article, Open Access

Investigation of Luhn's claim on information retrieval

   Kocabas, Ilker; Dincer, Bekir Taner; Karaoglan, Bahar

In this study, we show how Luhn's claim about the degree of importance of a word in a document can be related to information retrieval. His basic idea is transformed into z-scores, which serve as term weights for modeling term frequency (tf) within documents. The Luhn-based models presented in this paper are treated as the TF component of the proposed TF x IDF weighting schemes. The final term weighting functions appropriate for the TF x IDF weighting scheme are then applied to the TREC-6, -7, and -8 databases. The experimental results support Luhn's claim: high mean average precision (MAP) is obtained for terms whose frequencies lie around the mean frequency of terms within a document. On the other hand, a weighting that sharply discriminates between the importance of low/high frequencies and that of medium frequencies degrades retrieval performance. Therefore, any weighting scheme (TF) that is directly proportional to tf is likely to achieve high retrieval performance, provided that it optimally reflects the differences in importance across tf values and optimally eliminates terms that have high frequencies.
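
The abstract does not give the paper's weighting functions, so the following is only a minimal sketch of the general idea under stated assumptions: within-document term frequencies are standardized to z-scores, a bell-shaped function of z (an assumption, chosen so that terms near the mean frequency score highest) plays the role of the TF component, and a smoothed logarithmic IDF (also an assumption) completes the TF x IDF product. The function names `z_scores`, `luhn_tf_weight`, `idf`, and `tf_idf_weights` are hypothetical and do not come from the paper.

```python
import math
from collections import Counter


def z_scores(tokens):
    """Return the z-score of each distinct term's frequency within one document."""
    tf = Counter(tokens)
    freqs = list(tf.values())
    mean = sum(freqs) / len(freqs)
    variance = sum((f - mean) ** 2 for f in freqs) / len(freqs)
    std = math.sqrt(variance)
    if std == 0.0:
        # All terms occur equally often: every term sits exactly at the mean.
        return {term: 0.0 for term in tf}
    return {term: (f - mean) / std for term, f in tf.items()}


def luhn_tf_weight(z):
    """Assumed bell-shaped TF component: 1.0 at the mean, smaller for low/high tf."""
    return math.exp(-0.5 * z * z)


def idf(term, docs):
    """Assumed smoothed IDF; the paper's exact IDF form is not stated in the abstract."""
    n_docs = len(docs)
    doc_freq = sum(1 for doc in docs if term in doc)
    return math.log((n_docs + 1) / (doc_freq + 1)) + 1.0


def tf_idf_weights(doc, docs):
    """Combine the assumed Luhn-style TF component with IDF for one document."""
    return {
        term: luhn_tf_weight(z) * idf(term, docs)
        for term, z in z_scores(doc).items()
    }


if __name__ == "__main__":
    # Toy document: "a" is rare, "b" sits at the mean frequency, "c" is frequent.
    doc = ["a"] + ["b"] * 2 + ["c"] * 3
    for term, weight in sorted(tf_idf_weights(doc, [doc]).items()):
        print(term, round(weight, 3))
```

The width of the bell curve controls how strongly medium frequencies are favored over low and high ones. According to the abstract, discriminating too sharply between these groups degrades retrieval performance, so in practice that sharpness would need to be tuned rather than fixed as it is here.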
