DNN-BASED SPEAKER-ADAPTIVE POSTFILTERING WITH LIMITED ADAPTATION DATA FOR STATISTICAL SPEECH SYNTHESIS SYSTEMS

Ozturk, Mirac Goksu; Ulusoy, Okan; Demiroglu, Cenk

doi:10.48623/aperta.73701

Published January 1, 2019 | Version v1

Conference paper | Open access

1. Bogazici Univ, Comp Engn, Istanbul, Turkey
2. Bogazici Univ, Elect & Elect Engn, Istanbul, Turkey

Deep neural networks (DNNs) have been successfully deployed for acoustic modelling in statistical parametric speech synthesis (SPSS) systems. Moreover, DNN-based postfilters (PF) have also been shown to outperform conventional postfilters that are widely used in SPSS systems for increasing the quality of synthesized speech. However, existing DNN-based postfilters are trained with speaker-dependent databases. Given that SPSS systems can rapidly adapt to new speakers from generic models, there is a need for DNN-based postfilters that can adapt to new speakers with minimal adaptation data. Here, we compare DNN-, RNN-, and CNN-based postfilters together with adversarial (GAN) training and cluster-based initialization (CI) for rapid adaptation. Results indicate that the feedforward (FF) DNN, together with GAN and CI, significantly outperforms the other recently proposed postfilters.

Files

bib-fc7fa2b8-b570-4b00-9831-4f8833a215e7.txt (245 Bytes, md5:63f6d735c8a85450a2ae3ea6d487dc10)

