Title: Denoised senone I-Vectors for robust speaker verification
Authors: Tan, Z 
Mak, MW 
Mak, BKW
Zhu, Y
Keywords: Deep learning
Denoising autoencoders
Noise robustness
Phonetically discriminative features
Senone posteriors
Speaker verification
Issue Date: 2018
Publisher: Institute of Electrical and Electronics Engineers
Source: IEEE/ACM transactions on audio, speech, and language processing, 2018, v. 26, no. 4, 8269399, p. 820-830
Journal: IEEE/ACM transactions on audio, speech, and language processing 
Abstract: Recently, it has been shown that senone i-vectors, whose posteriors are produced by senone deep neural networks (DNNs), outperform the conventional Gaussian mixture model (GMM) i-vectors in both speaker and language recognition tasks. The success of senone i-vectors relies on the capability of the DNN to incorporate phonetic information into the i-vector extraction process. In this paper, we argue that to apply senone i-vectors in noisy environments, it is important to robustify the phonetically discriminative acoustic features and senone posteriors estimated by the DNN. To this end, we propose a deep architecture formed by stacking a deep belief network on top of a denoising autoencoder (DAE). After backpropagation fine-tuning, the network, referred to as denoising autoencoder-deep neural network (DAE-DNN), facilitates the extraction of robust phonetically discriminative bottleneck (BN) features and senone posteriors for i-vector extraction. We refer to the resulting i-vectors as denoised BN-based senone i-vectors. Results on NIST 2012 SRE show that senone i-vectors outperform the conventional GMM i-vectors. More interestingly, the BN features are not only phonetically discriminative; results suggest that they also contain sufficient speaker information to produce BN-based senone i-vectors that outperform the conventional senone i-vectors. This work also shows that DAE training is more beneficial to BN feature extraction than to senone posterior estimation.
ISSN: 2329-9290
EISSN: 2329-9304
DOI: 10.1109/TASLP.2018.2796843
Appears in Collections: Journal/Magazine Article
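
The architecture described in the abstract can be illustrated with a minimal, untrained forward pass. This is only a sketch under stated assumptions, not the authors' implementation: the layer sizes, the sigmoid activations, the linear bottleneck, the choice to feed the DAE's reconstructed frames into the upper network, and all names (`dae_dnn_forward`, `init_layers`, `bn_index`) are hypothetical. Weights are random, and no pre-training or backpropagation fine-tuning is shown.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x):
    # Subtract the row-wise max for numerical stability.
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def init_layers(sizes, rng):
    # Random (untrained) weights and zero biases; a stand-in for trained parameters.
    return [(rng.standard_normal((m, n)) * 0.1, np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def dae_dnn_forward(noisy_frames, dae_layers, dnn_layers, bn_index):
    """Return (denoised frames, bottleneck features, senone posteriors)."""
    # Stage 1 (assumed): the DAE maps noisy frames to reconstructed frames.
    h = noisy_frames
    for i, (W, b) in enumerate(dae_layers):
        a = h @ W + b
        h = a if i == len(dae_layers) - 1 else sigmoid(a)  # linear output layer
    denoised = h

    # Stage 2 (assumed): upper network with a bottleneck and a softmax output.
    bn_features = None
    for i, (W, b) in enumerate(dnn_layers):
        a = h @ W + b
        if i == len(dnn_layers) - 1:
            h = softmax(a)          # senone posteriors, one row per frame
        elif i == bn_index:
            h = a                   # linear bottleneck (assumption)
            bn_features = h
        else:
            h = sigmoid(a)
    return denoised, bn_features, h

# Hypothetical dimensions chosen only to keep the example small.
FEAT_DIM, BN_DIM, NUM_SENONES = 39, 20, 100
rng = np.random.default_rng(0)
dae_layers = init_layers([FEAT_DIM, 64, FEAT_DIM], rng)
dnn_layers = init_layers([FEAT_DIM, 64, BN_DIM, 64, NUM_SENONES], rng)

frames = rng.standard_normal((5, FEAT_DIM))  # 5 synthetic "noisy" frames
denoised, bn, post = dae_dnn_forward(frames, dae_layers, dnn_layers, bn_index=1)
```

In a pipeline of the kind the abstract outlines, the bottleneck features would presumably serve as the acoustic vectors and the senone posteriors as the frame alignments for i-vector extraction; that step is not shown here.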