Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/107190
DC Field | Value | Language
dc.contributor | Department of Electrical and Electronic Engineering | -
dc.creator | Yao, Q | -
dc.creator | Mak, MW | -
dc.date.accessioned | 2024-06-13T01:04:29Z | -
dc.date.available | 2024-06-13T01:04:29Z | -
dc.identifier.issn | 1070-9908 | -
dc.identifier.uri | http://hdl.handle.net/10397/107190 | -
dc.language.iso | en | en_US
dc.publisher | Institute of Electrical and Electronics Engineers | en_US
dc.rights | © 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. | en_US
dc.rights | The following publication Q. Yao and M.-W. Mak, "SNR-Invariant Multitask Deep Neural Networks for Robust Speaker Verification," in IEEE Signal Processing Letters, vol. 25, no. 11, pp. 1670-1674, Nov. 2018 is available at https://doi.org/10.1109/LSP.2018.2870726. | en_US
dc.subject | Deep learning | en_US
dc.subject | I-vectors | en_US
dc.subject | Multitask learning | en_US
dc.subject | Noise robustness | en_US
dc.subject | Speaker verification | en_US
dc.title | SNR-invariant multitask deep neural networks for robust speaker verification | en_US
dc.type | Journal/Magazine Article | en_US
dc.identifier.spage | 1670 | -
dc.identifier.epage | 1674 | -
dc.identifier.volume | 25 | -
dc.identifier.issue | 11 | -
dc.identifier.doi | 10.1109/LSP.2018.2870726 | -
dcterms.abstract | A major challenge in speaker verification is to achieve low error rates under noisy environments. We observed that background noise in utterances will not only enlarge the speaker-dependent i-vector clusters but also shift the clusters, with the amount of shift depending on the signal-to-noise ratio (SNR) of the utterances. To overcome this SNR-dependent clustering phenomenon, we propose two deep neural network (DNN) architectures: hierarchical regression DNN (H-RDNN) and multitask DNN (MT-DNN). The H-RDNN is formed by stacking two regression DNNs in which the lower DNN is trained to map noisy i-vectors to their respective speaker-dependent cluster means of clean i-vectors and the upper DNN aims to regularize the outliers that cannot be denoised properly by the lower DNN. The MT-DNN is trained to denoise i-vectors (main task) and classify speakers (auxiliary task). The network leverages the auxiliary task to retain speaker information in the denoised i-vectors. Experimental results suggest that these two DNN architectures together with the PLDA backend significantly outperform the multicondition PLDA model and mixtures of PLDA, and that multitask learning helps to boost verification performance. | -
dcterms.accessRights | open access | en_US
dcterms.bibliographicCitation | IEEE signal processing letters, Nov. 2018, v. 25, no. 11, p. 1670-1674 | -
dcterms.isPartOf | IEEE signal processing letters | -
dcterms.issued | 2018-11 | -
dc.identifier.scopus | 2-s2.0-85053349975 | -
dc.identifier.eissn | 1558-2361 | -
dc.description.validate | 202403 bckw | -
dc.description.oa | Accepted Manuscript | en_US
dc.identifier.FolderNumber | EIE-0455 | en_US
dc.description.fundingSource | RGC | en_US
dc.description.fundingSource | Others | en_US
dc.description.fundingText | HKPolyU | en_US
dc.description.pubStatus | Published | en_US
dc.identifier.OPUS | 20150464 | en_US
dc.description.oaCategory | Green (AAM) | en_US
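The dcterms.abstract field above outlines two denoising architectures for noisy i-vectors: a hierarchical regression DNN (H-RDNN) and a multitask DNN (MT-DNN) that denoises i-vectors as its main task while classifying speakers as an auxiliary task. As a rough illustration of the multitask idea only, the following PyTorch sketch shows a shared network with a denoising regression head and an auxiliary speaker-classification head; the layer sizes, the 500-dimensional i-vectors, the number of speakers, and the loss weight alpha are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch (not the authors' code) of a multitask denoising DNN in the
# spirit of the MT-DNN described in the abstract: a shared encoder maps a
# noisy i-vector to a denoised i-vector (main task) while an auxiliary head
# classifies the speaker. All dimensions and the loss weight are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultitaskDenoiser(nn.Module):
    def __init__(self, ivec_dim=500, hidden_dim=1024, n_speakers=1000):
        super().__init__()
        # Hidden layers shared by both tasks
        self.shared = nn.Sequential(
            nn.Linear(ivec_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        # Main task: regress the denoised (clean) i-vector
        self.denoise_head = nn.Linear(hidden_dim, ivec_dim)
        # Auxiliary task: speaker classification logits
        self.speaker_head = nn.Linear(hidden_dim, n_speakers)

    def forward(self, noisy_ivec):
        h = self.shared(noisy_ivec)
        return self.denoise_head(h), self.speaker_head(h)

def multitask_loss(denoised, clean_target, logits, speaker_ids, alpha=0.5):
    """Denoising MSE (main task) plus alpha-weighted speaker cross-entropy
    (auxiliary task); alpha = 0.5 is an assumed weighting."""
    return F.mse_loss(denoised, clean_target) + alpha * F.cross_entropy(logits, speaker_ids)

# Example forward/backward pass with random tensors (shapes only, no real data)
model = MultitaskDenoiser()
noisy = torch.randn(8, 500)          # batch of noisy i-vectors
clean = torch.randn(8, 500)          # stand-in for clean cluster-mean targets
spk = torch.randint(0, 1000, (8,))   # speaker labels
denoised, logits = model(noisy)
multitask_loss(denoised, clean, logits, spk).backward()
```

The auxiliary classification head is what encourages the denoised i-vectors to retain speaker information, which is the role the abstract attributes to the auxiliary task; at test time only the denoising output would be passed to the PLDA backend.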
Appears in Collections: Journal/Magazine Article
Files in This Item:
File | Description | Size | Format
Yao_Snr-Invariant_Multitask_Deep.pdf | Pre-Published version | 814.52 kB | Adobe PDF
Open Access Information
Status: open access
File Version: Final Accepted Manuscript