Published January 1, 2010
| Version v1
Journal article
Open
The Noisy Channel Model for Unsupervised Word Sense Disambiguation
Description
We introduce a generative probabilistic model, the noisy channel model, for unsupervised word sense disambiguation. In our model, each context C is modeled as a distinct channel through which the speaker intends to transmit a particular meaning S using a possibly ambiguous word W. To reconstruct the intended meaning, the hearer uses the distribution of possible meanings in the given context, P(S|C), and of possible words that can express each meaning, P(W|S). We assume P(W|S) is independent of the context and estimate it using WordNet sense frequencies. The main problem in unsupervised WSD is estimating the context-dependent P(S|C) without access to any sense-tagged text. We show one way to solve this problem using a statistical language model trained on large amounts of untagged text. Our model uses coarse-grained semantic classes for S internally, and we explore the effect of different levels of granularity on WSD performance. The system outputs fine-grained senses for evaluation; its performance on noun disambiguation is better than most previously reported unsupervised systems and close to the best supervised systems.
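The decoding rule described above, where the hearer combines P(S|C) and P(W|S), can be sketched as follows. This is a minimal illustration, not the authors' code: the function name, the toy senses, and all probability values are invented for the example; in the paper P(W|S) comes from WordNet sense frequencies and P(S|C) from a language model over untagged text.

```python
def disambiguate(word, context, p_sense_given_context, p_word_given_sense):
    """Return the sense S maximizing P(S|C) * P(W|S) for a word in a context."""
    best_sense, best_score = None, 0.0
    for sense, p_sc in p_sense_given_context[context].items():
        # P(W|S) is context-independent; missing entries count as zero.
        p_ws = p_word_given_sense.get(sense, {}).get(word, 0.0)
        score = p_sc * p_ws
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

# Toy distributions: the ambiguous word "bass" in two contexts.
p_sense_given_context = {
    "music":   {"bass#instrument": 0.8, "bass#fish": 0.2},
    "fishing": {"bass#instrument": 0.1, "bass#fish": 0.9},
}
p_word_given_sense = {
    "bass#instrument": {"bass": 0.7, "guitar": 0.3},
    "bass#fish":       {"bass": 0.9, "trout": 0.1},
}

print(disambiguate("bass", "music", p_sense_given_context, p_word_given_sense))
print(disambiguate("bass", "fishing", p_sense_given_context, p_word_given_sense))
```

In the "music" context the instrument sense wins (0.8 × 0.7 > 0.2 × 0.9), while in the "fishing" context the fish sense wins, illustrating how the same context-independent P(W|S) yields different decisions as P(S|C) changes.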
Files
bib-9aaa8797-a4d6-4458-8bae-a44eef0f0288.txt
(140 Bytes)
md5:a859258798db5705f8cff78f0869753d