SNR-invariant PLDA modeling in nonparametric subspace for robust speaker verification

Li, N; Mak, MW

doi:10.1109/TASLP.2015.2442757

Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/16012

Title:	SNR-invariant PLDA modeling in nonparametric subspace for robust speaker verification
Authors:	Li, N; Mak, MW
Issue Date:	Oct-2015
Source:	IEEE transactions on audio, speech and language processing, Oct. 2015, v. 23, no. 10, p. 1648-1659
Abstract:	While the i-vector/PLDA framework has achieved great success, its performance still degrades dramatically under noisy conditions. To compensate for the variability of i-vectors caused by different levels of background noise, this paper proposes an SNR-invariant PLDA framework for robust speaker verification. First, nonparametric feature analysis (NFA) is employed to suppress intra-speaker variation and emphasize the discriminative information inherent in the boundaries between speakers in the i-vector space. Then, in the NFA-projected subspace, SNR-invariant PLDA is applied to separate the SNR-specific information from the speaker-specific information using an identity factor and an SNR factor. Accordingly, a projected i-vector in the NFA subspace can be represented as a linear combination of three components: speaker, SNR, and channel. During verification, the variability due to SNR and channels is integrated out when computing the marginal likelihood ratio. Experiments based on NIST 2012 SRE show that the proposed framework achieves superior performance compared with conventional PLDA and SNR-dependent mixture of PLDA.
Keywords:	i-vector; Nonparametric feature analysis; Probabilistic linear discriminant analysis (PLDA); SNR-invariant; Speaker verification
Publisher:	Institute of Electrical and Electronics Engineers
Journal:	IEEE transactions on audio, speech and language processing
ISSN:	1558-7916
EISSN:	1558-7924
DOI:	10.1109/TASLP.2015.2442757
Rights:	© 2015 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The following publication Li, N., & Mak, M. W. (2015). SNR-invariant PLDA modeling in nonparametric subspace for robust speaker verification. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 23(10), 1648-1659 is available at https://doi.org/10.1109/TASLP.2015.2442757.
Appears in Collections:	Journal/Magazine Article
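
Note on the abstract: the three-component decomposition and the marginal likelihood ratio it describes can be written out as a short sketch. The symbols below (m, V, U, h_i, w_k, the residual term, and the covariance names) are assumptions chosen for illustration under common PLDA conventions; they are not taken from this record, and the full text should be consulted for the authors' exact formulation. The score expression further assumes that the two utterances being compared have independent SNR factors.

```latex
% Assumed generative model: utterance j of speaker i, falling in SNR group k
\mathbf{x}_{ij}^{k} = \mathbf{m} + \mathbf{V}\mathbf{h}_i + \mathbf{U}\mathbf{w}_k + \boldsymbol{\epsilon}_{ij}^{k},
\qquad
\mathbf{h}_i \sim \mathcal{N}(\mathbf{0}, \mathbf{I}),\quad
\mathbf{w}_k \sim \mathcal{N}(\mathbf{0}, \mathbf{I}),\quad
\boldsymbol{\epsilon}_{ij}^{k} \sim \mathcal{N}(\mathbf{0}, \boldsymbol{\Sigma})

% Integrating out the speaker, SNR, and channel terms gives Gaussian marginals
\boldsymbol{\Sigma}_{\mathrm{tot}} = \mathbf{V}\mathbf{V}^{\top} + \mathbf{U}\mathbf{U}^{\top} + \boldsymbol{\Sigma},
\qquad
\boldsymbol{\Sigma}_{\mathrm{ac}} = \mathbf{V}\mathbf{V}^{\top}

% Log-likelihood ratio for a target/test pair (x_s, x_t)
S(\mathbf{x}_s, \mathbf{x}_t) =
\log \mathcal{N}\!\left(
\begin{bmatrix}\mathbf{x}_s \\ \mathbf{x}_t\end{bmatrix}
\Bigg|
\begin{bmatrix}\mathbf{m} \\ \mathbf{m}\end{bmatrix},
\begin{bmatrix}\boldsymbol{\Sigma}_{\mathrm{tot}} & \boldsymbol{\Sigma}_{\mathrm{ac}} \\ \boldsymbol{\Sigma}_{\mathrm{ac}} & \boldsymbol{\Sigma}_{\mathrm{tot}}\end{bmatrix}
\right)
-
\log \mathcal{N}\!\left(
\begin{bmatrix}\mathbf{x}_s \\ \mathbf{x}_t\end{bmatrix}
\Bigg|
\begin{bmatrix}\mathbf{m} \\ \mathbf{m}\end{bmatrix},
\begin{bmatrix}\boldsymbol{\Sigma}_{\mathrm{tot}} & \mathbf{0} \\ \mathbf{0} & \boldsymbol{\Sigma}_{\mathrm{tot}}\end{bmatrix}
\right)
```

In this reading, only the speaker term V h_i is shared between two utterances of the same speaker, so it alone contributes to the cross-covariance in the numerator; the SNR and channel terms enter only through the total covariance.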

Files in This Item:

File	Description	Size	Format
SNR-invariant_PLDmodeling_Nonparametric.pdf	Pre-Published version	690.07 kB	Adobe PDF

Open Access Information

Status	open access
File Version	Final Accepted Manuscript

Usage and Citation Statistics

Metric	Total	Last week	Last month	As of
Page views	102	1	-	Apr 14, 2024
Downloads	29	-	-	Apr 14, 2024
Scopus citations	29	1	0	Apr 12, 2024
Web of Science citations	24	0	0	Apr 18, 2024
