Improving Turkish Telephone Speech Recognition with Data Augmentation and Out of Domain Data

Uslu, Zeynep Gulhan; Yildirim, Tulay

doi:10.81043/aperta.99851

Published January 1, 2019 | Version v1

Conference paper Open

1. Yildiz Tech Univ, TUBITAK BILGEM, Elect & Commun Engn Dept, Istanbul, Turkey
2. Yildiz Tech Univ, Elect & Commun Engn Dept, Istanbul, Turkey

In this paper, we investigate the effects of data augmentation and of adding out-of-domain data on Turkish spontaneous speech recognition. We apply different acoustic model training techniques, including Gaussian Mixture Models, Deep Neural Networks, and Time Delay Neural Networks, to the Babel Turkish spontaneous telephone speech data. We find that the Time Delay Neural Network acoustic model with iVectors gives the best result. We demonstrate the effect of data augmentation by adding speed- and volume-perturbed data to the training set. We investigate the effect of increasing the acoustic model training data by including two call center datasets. We further increase the training data by adding about 100 hours of modified out-of-domain broadcast data. We also examine the effect of neural network based language modeling techniques such as Recurrent Neural Network language models.
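
The abstract does not say how the speed and volume perturbation was implemented, so the following is only a minimal sketch of the general idea, not the authors' pipeline. It assumes a waveform held as a plain list of float samples, resamples it by linear interpolation to change its speed, and scales it by a random gain to change its volume. The function names, the speed factors 0.9/1.0/1.1, and the gain range are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def speed_perturb(samples, factor):
    """Resample by linear interpolation so the signal plays `factor` times faster.

    factor > 1 shortens the signal (and raises pitch); factor < 1 lengthens it.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    if not samples:
        return []
    out_len = max(1, int(round(len(samples) / factor)))
    last = len(samples) - 1
    out = []
    for i in range(out_len):
        pos = min(i * factor, last)      # fractional read position in the input
        lo = int(math.floor(pos))
        hi = min(lo + 1, last)
        frac = pos - lo
        out.append((1.0 - frac) * samples[lo] + frac * samples[hi])
    return out


def volume_perturb(samples, gain):
    """Scale every sample by a constant gain."""
    return [gain * s for s in samples]


def augment(samples, speed_factors=(0.9, 1.0, 1.1), gain_range=(0.5, 2.0), seed=0):
    """Return one perturbed copy per speed factor, each with a random gain.

    The default factors and gain range are assumed values for illustration.
    """
    rng = random.Random(seed)
    copies = []
    for factor in speed_factors:
        gain = rng.uniform(*gain_range)
        copies.append(volume_perturb(speed_perturb(samples, factor), gain))
    return copies


if __name__ == "__main__":
    # One second of a 440 Hz tone at an assumed 8 kHz telephone sampling rate.
    tone = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(8000)]
    for copy in augment(tone):
        print(len(copy))
```

Each perturbed copy would be added to the training set alongside the original utterance, so three speed factors roughly triple the amount of acoustic training data. In practice a dedicated resampler would normally replace the linear interpolation used here.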
