Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/107247
Title: Deep neural network driven mixture of PLDA for robust i-vector speaker verification
Authors: Li, N 
Mak, MW 
Chien, JT
Issue Date: 2016
Source: In Proceedings of 2016 IEEE Spoken Language Technology Workshop (SLT), 13-16 December 2016, San Diego, CA, USA
Abstract: In speaker recognition, the mismatch between the enrollment and test utterances due to noise with different signal-to-noise ratios (SNRs) is a great challenge. Based on the observation that noise-level variability causes the i-vectors to form heterogeneous clusters, this paper proposes using an SNR-aware deep neural network (DNN) to guide the training of PLDA mixture models. Specifically, given an i-vector, the SNR posterior probabilities produced by the DNN are used as the posteriors of indicator variables of the mixture model. As a result, the proposed model provides a more reasonable soft division of the i-vector space compared to the conventional mixture of PLDA. During verification, given a test trial, the marginal likelihoods from individual PLDA models are linearly combined by the posterior probabilities of SNR levels computed by the DNN. Experimental results for SNR mismatch tasks based on NIST 2012 SRE suggest that the proposed model is more effective than PLDA and conventional mixture of PLDA for handling heterogeneous corpora.
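Illustration (not part of the original record): the verification step described in the abstract, in which per-model likelihoods are linearly combined using the DNN's SNR posteriors, might be sketched as below. This is a minimal sketch under stated assumptions, not the authors' implementation. The function names (`softmax`, `combine_log_likelihoods`, `mixture_llr`), the use of a single posterior vector per trial, and the log-likelihood-ratio form of the final score are all hypothetical choices made for this example. The per-model log-likelihoods are assumed to come from already-trained PLDA models.

```python
import math


def softmax(logits):
    """Turn DNN output activations into SNR-level posteriors (assumed output layer)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def combine_log_likelihoods(posteriors, log_likelihoods):
    """Return log(sum_k posteriors[k] * exp(log_likelihoods[k])).

    This is the linear combination of per-model likelihoods, evaluated in the
    log domain with the log-sum-exp trick for numerical stability.
    Assumes at least one posterior is strictly positive.
    """
    terms = [
        math.log(p) + ll
        for p, ll in zip(posteriors, log_likelihoods)
        if p > 0.0  # components with zero posterior contribute nothing
    ]
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))


def mixture_llr(posteriors, same_speaker_ll, diff_speaker_ll):
    """Hypothetical trial score: log-ratio of the two combined likelihoods.

    same_speaker_ll[k] and diff_speaker_ll[k] are the log-likelihoods of the
    trial under PLDA model k for the same-speaker and different-speaker
    hypotheses (assumed inputs for this sketch).
    """
    return combine_log_likelihoods(posteriors, same_speaker_ll) - combine_log_likelihoods(
        posteriors, diff_speaker_ll
    )


# Toy usage with three SNR levels; all numbers are made up for illustration.
snr_posteriors = softmax([2.0, 0.5, -1.0])
score = mixture_llr(snr_posteriors, [-10.0, -12.0, -15.0], [-14.0, -13.0, -15.5])
```

Working in the log domain avoids underflow, since PLDA likelihoods of high-dimensional i-vectors are typically very small numbers.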
Keywords: Deep neural networks
I-vector
Mixture of PLDA
SNR mismatch
Speaker verification
Publisher: Institute of Electrical and Electronics Engineers
ISBN: 978-1-5090-4903-5 (Electronic)
978-1-5090-4904-2 (Print on Demand (PoD))
DOI: 10.1109/SLT.2016.7846263
Description: 2016 IEEE Spoken Language Technology Workshop (SLT), 13-16 December 2016, San Diego, CA, USA
Rights: ©2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
The following publication N. Li, M.-W. Mak and J.-T. Chien, "Deep neural network driven mixture of PLDA for robust i-vector speaker verification," 2016 IEEE Spoken Language Technology Workshop (SLT), San Diego, CA, USA, 2016, pp. 186-191 is available at https://doi.org/10.1109/SLT.2016.7846263.
Appears in Collections: Conference Paper

Files in This Item:
File: Mak_Deep_Neural_Network.pdf
Description: Pre-Published version
Size: 375.59 kB
Format: Adobe PDF
Open Access Information
Status: open access
File Version: Final Accepted Manuscript
