SPEECH DETECTION ON BROADCAST AUDIO

Zubari, Unal; Ozan, Ezgi Can; Acar, Banu Oskay; Ciloglu, Tolga; Esen, Ersin; Ates, Tugrul K.; Onur, Duygu Oskay

doi:10.81043/aperta.92905

Conference paper, Open Access

Speech boundary detection contributes to the performance of speech-based applications such as speech recognition and speaker recognition. The speech boundary detector implemented in this study works on broadcast audio as a pre-processor module of a keyword spotter. Speech boundary detection is handled in three steps. In the first step, audio data is segmented into homogeneous regions in an unsupervised manner. In the second step, an ACTIVITY/NON-ACTIVITY decision is made for each region. In the third step, ACTIVITY regions are classified as Speech/Non-speech via Gaussian Mixture Model (GMM) based classification. The GMMs are trained using a novel feature, Spectral Flow Direction (SFD), and an improved multi-band harmonicity feature, in addition to the widely used Mel Frequency Cepstral Coefficients (MFCCs).
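
The last two steps of the abstract (the activity decision, then GMM-based Speech/Non-speech classification) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the segmentation step has already produced a region, uses a mean-energy threshold as a stand-in for the activity decision (the abstract does not state the criterion), and uses toy two-dimensional feature vectors in place of the MFCC, SFD, and harmonicity features. All function names, the threshold value, and the GMM parameters are hypothetical.

```python
import math


def gmm_log_likelihood(frame, weights, means, variances):
    """Log-likelihood of one feature vector under a diagonal-covariance GMM."""
    component_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for x, m, v in zip(frame, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        component_logs.append(log_p)
    # Log-sum-exp over components for numerical stability.
    peak = max(component_logs)
    return peak + math.log(sum(math.exp(c - peak) for c in component_logs))


def is_active(energies, threshold):
    """Assumed activity rule: mean frame energy above a hypothetical threshold."""
    return sum(energies) / len(energies) > threshold


def classify_region(frames, energies, speech_gmm, nonspeech_gmm,
                    energy_threshold=0.01):
    """Label one pre-segmented region.

    Each GMM is a (weights, means, variances) tuple. The region takes the
    label of the model with the higher summed frame log-likelihood.
    """
    if not is_active(energies, energy_threshold):
        return "NON-ACTIVITY"
    speech_score = sum(gmm_log_likelihood(f, *speech_gmm) for f in frames)
    nonspeech_score = sum(gmm_log_likelihood(f, *nonspeech_gmm) for f in frames)
    return "Speech" if speech_score > nonspeech_score else "Non-speech"


# Toy models and data, invented for illustration only.
speech_gmm = ([0.5, 0.5], [[1.0, 1.0], [2.0, 2.0]], [[0.5, 0.5], [0.5, 0.5]])
nonspeech_gmm = ([1.0], [[-1.0, -1.0]], [[0.5, 0.5]])
region_frames = [[1.1, 0.9], [1.8, 2.1]]
region_energies = [0.2, 0.3]

print(classify_region(region_frames, region_energies, speech_gmm, nonspeech_gmm))
```

In a real system the GMM parameters would be estimated from labelled training audio rather than written by hand, and the decision could be smoothed across neighbouring regions; neither is shown here.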

Record Details

Publication date: 01/01/2010
Conference: 18th European Signal Processing Conference (EUSIPCO-2010)
License: Creative Commons Attribution
