SNR-invariant PLDA modeling for robust speaker verification

Li, N; Mak, MW

doi:10.21437/interspeech.2015-502

Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/111714

DC Field	Value	Language
dc.contributor	Department of Electrical and Electronic Engineering	-
dc.creator	Li, N	-
dc.creator	Mak, MW	-
dc.date.accessioned	2025-03-13T02:22:11Z	-
dc.date.available	2025-03-13T02:22:11Z	-
dc.identifier.uri	http://hdl.handle.net/10397/111714	-
dc.description	16th Annual Conference of the International Speech Communication Association, INTERSPEECH 2015, Dresden, Germany, September 6-10, 2015	en_US
dc.language.iso	en	en_US
dc.publisher	International Speech Communication Association	en_US
dc.rights	Copyright © 2015 ISCA	en_US
dc.rights	The following publication Li, N., Mak, M.-W. (2015) SNR-invariant PLDA modeling for robust speaker verification. Proc. Interspeech 2015, 2317-2321 is available at https://doi.org/10.21437/Interspeech.2015-502.	en_US
dc.title	SNR-invariant PLDA modeling for robust speaker verification	en_US
dc.type	Conference Paper	en_US
dc.identifier.spage	2317	-
dc.identifier.epage	2321	-
dc.identifier.doi	10.21437/interspeech.2015-502	-
dcterms.abstract	In spite of the great success of the i-vector/PLDA framework, speaker verification in noisy environments remains a challenge. To compensate for the variability of i-vectors caused by different levels of background noise, this paper proposes a new framework, namely SNR-invariant PLDA, for robust speaker verification. By assuming that i-vectors extracted from utterances falling within a narrow SNR range share similar SNR-specific information, the paper introduces an SNR factor to the conventional PLDA model. Then, the SNR-related variability and the speaker-related variability embedded in the i-vectors are modeled by the SNR factor and the speaker factor, respectively. Accordingly, an i-vector is represented by a linear combination of three components: speaker, SNR, and channel. During verification, the variability due to SNR and channels is marginalized out when computing the marginal likelihood ratio. Experiments based on NIST 2012 SRE show that SNR-invariant PLDA achieves superior performance when compared with the conventional PLDA and the SNR-dependent mixture of PLDA.	-
dcterms.accessRights	open access	en_US
dcterms.bibliographicCitation	Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2015, p. 2317-2321	-
dcterms.issued	2015	-
dc.identifier.scopus	2-s2.0-84959154254	-
dc.relation.conference	Conference of the International Speech Communication Association [INTERSPEECH]	-
dc.description.validate	202503 bcch	-
dc.description.oa	Version of Record	en_US
dc.identifier.FolderNumber	OA_Others	en_US
dc.description.fundingSource	RGC	en_US
dc.description.pubStatus	Published	en_US
dc.description.oaCategory	VoR allowed	en_US
Appears in Collections:	Conference Paper
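
Illustrative sketch (not taken from the paper): the abstract describes an i-vector as a linear combination of speaker, SNR, and channel components. The Python below is a minimal generative sketch of that idea, assuming the form x = m + V h + U w + eps, where h is a speaker factor, w is a factor shared by utterances in the same SNR range, and eps is an isotropic Gaussian residual standing in for the channel term. The symbols m, V, U, h, w, the isotropic residual, the dimensions, and all function names are assumptions made for illustration; the paper's own notation and training or scoring procedure may differ.

```python
import random


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]


def sample_ivector(m, V, U, h, w, noise_std, rng):
    """Draw one i-vector under the assumed model x = m + V h + U w + eps.

    h: speaker factor (shared by all utterances of one speaker)
    w: SNR factor (shared by all utterances in one SNR range)
    eps: isotropic Gaussian residual (assumed stand-in for channel variability)
    """
    speaker_part = matvec(V, h)
    snr_part = matvec(U, w)
    return [mu + s + n + rng.gauss(0.0, noise_std)
            for mu, s, n in zip(m, speaker_part, snr_part)]


def marginal_covariance(V, U, noise_std):
    """Covariance of x once h, w ~ N(0, I) and eps are integrated out:
    V V^T + U U^T + noise_std^2 * I."""
    dim = len(V)
    return [[sum(V[r][q] * V[c][q] for q in range(len(V[0])))
             + sum(U[r][q] * U[c][q] for q in range(len(U[0])))
             + (noise_std ** 2 if r == c else 0.0)
             for c in range(dim)]
            for r in range(dim)]


# Toy dimensions, chosen only for illustration.
rng = random.Random(0)
D, Q_SPK, Q_SNR = 4, 2, 1
m = [0.0] * D
V = [[rng.gauss(0, 1) for _ in range(Q_SPK)] for _ in range(D)]
U = [[rng.gauss(0, 1) for _ in range(Q_SNR)] for _ in range(D)]

h = [rng.gauss(0, 1) for _ in range(Q_SPK)]        # one speaker
w_low = [rng.gauss(0, 1) for _ in range(Q_SNR)]    # low-SNR group
w_high = [rng.gauss(0, 1) for _ in range(Q_SNR)]   # high-SNR group

# Same speaker observed in two SNR groups: only the U w term and the
# residual differ between the two i-vectors.
x_low = sample_ivector(m, V, U, h, w_low, 0.1, rng)
x_high = sample_ivector(m, V, U, h, w_high, 0.1, rng)
cov = marginal_covariance(V, U, 0.1)
```

In this sketch the two i-vectors share the speaker term V h, so a scoring rule that integrates out w and eps compares them on speaker information alone; marginal_covariance shows the per-i-vector covariance that results from that integration under the stated Gaussian assumptions.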

Files in This Item:

File	Description	Size	Format
li15c_interspeech.pdf		310.24 kB	Adobe PDF	View/Open

Open Access Information

Status	open access
File Version	Version of Record


Metric	Count	As of
Page views	137	Feb 9, 2026
Downloads	48	Feb 9, 2026
Scopus citations	8	May 8, 2026
