Published January 1, 2008 | Version v1
Conference paper
Open
Discriminative N-gram Language Modeling for Turkish
- 1. Bogazici Univ, Dept Elect & Elect Engn, Istanbul, Turkey
- 2. OGI OHSU, Ctr Spoken Language Understanding, Beaverton, OR USA
Description
In this paper, Discriminative Language Models (DLMs) are applied to the Turkish Broadcast News transcription task. Turkish presents a challenge to Automatic Speech Recognition (ASR) systems due to its rich morphology. Therefore, in addition to word n-gram features, morphology-based features such as root n-grams and inflectional group n-grams are incorporated into DLMs to improve the language models. Various feature sets provide reductions in the word error rate (WER). Our best result is obtained with the inflectional group n-gram features: a 1.0% absolute improvement over the baseline model, which is statistically significant at p<0.001 as measured by the NIST MAPSSWE significance test.
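For readers unfamiliar with the metric, the following is a minimal sketch of how word error rate is conventionally computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a recognizer hypothesis, divided by the number of reference words. The function name `word_error_rate` and the toy strings are illustrative assumptions, not taken from the paper, and the paper's own scoring setup may differ in details such as text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (hypothetical name); tokenizes on whitespace only.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the first i-1 reference
    # words and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words so far
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                       # deletion
                curr[j - 1] + 1,                   # insertion
                prev[j - 1] + substitution_cost,   # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


# Toy example: one substitution (b -> x) and one deletion (d) over 4 words.
print(word_error_rate("a b c d", "a x c"))  # 0.5
```

An absolute improvement, as reported in the description, is a difference in percentage points between two such rates, as opposed to a relative reduction expressed as a fraction of the baseline rate.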
Files
| Name | Size |
|---|---|
| bib-c60b719d-b5db-4c59-ab16-87263749f433.txt (md5:5e17aef2293c92ae8e98d78a3efbf94c) | 219 Bytes |