SNR-invariant PLDA with multiple speaker subspaces

Li, N; Mak, MW

doi:10.1109/ICASSP.2016.7472742

Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/107270

Title:	SNR-invariant PLDA with multiple speaker subspaces
Authors:	Li, N; Mak, MW
Issue Date:	2016
Source:	In Proceedings of 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 20-25 March 2016, Shanghai, China, p. 5565-5569
Abstract:	To deal with the mismatch between enrollment and test utterances caused by noise at different signal-to-noise ratios (SNRs), we recently proposed an SNR-invariant PLDA model for robust speaker verification. In that model, SNR-specific information is separated from speaker-specific information by marginalizing out the SNR factors during scoring. However, this modeling approach assumes that speaker variability can be captured by a single speaker subspace regardless of the noise level of the utterances. We show in this paper that i-vectors extracted from utterances with different noise levels shift to different regions of the i-vector space and that i-vectors extracted from utterances with similar SNRs tend to cluster together. In view of this observation, we propose introducing multiple speaker subspaces into the SNR-invariant PLDA model and using multiple covariance matrices to represent SNR-dependent channel variability. Evaluations on NIST 2012 SRE demonstrate that this finer and more precise modeling of speaker and SNR variability leads to better performance than conventional PLDA and SNR-invariant PLDA.
Keywords:	I-vectors; SNR subspaces; SNR-invariant PLDA; Speaker subspaces; Speaker verification
Publisher:	Institute of Electrical and Electronics Engineers
ISBN:	978-1-4799-9988-0 (Electronic); 978-1-4799-9987-3 (USB)
DOI:	10.1109/ICASSP.2016.7472742
Description:	2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 20-25 March 2016, Shanghai, China
Rights:	©2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The following publication N. Li and M. -W. Mak, "SNR-invariant PLDA with multiple speaker subspaces," 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Shanghai, China, 2016, pp. 5565-5569 is available at https://doi.org/10.1109/ICASSP.2016.7472742.
Appears in Collections:	Conference Paper

Files in This Item:

File	Description	Size	Format
Mak_Snr-Invariant_Plda_Multiple.pdf	Pre-Published version	938.8 kB	Adobe PDF

Open Access Information

Status	open access
File Version	Final Accepted Manuscript
