Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/107190
DC Field | Value | Language
dc.contributor | Department of Electrical and Electronic Engineering | -
dc.creator | Yao, Q | -
dc.creator | Mak, MW | -
dc.date.accessioned | 2024-06-13T01:04:29Z | -
dc.date.available | 2024-06-13T01:04:29Z | -
dc.identifier.issn | 1070-9908 | -
dc.identifier.uri | http://hdl.handle.net/10397/107190 | -
dc.language.iso | en | en_US
dc.publisher | Institute of Electrical and Electronics Engineers | en_US
dc.rights | © 2018 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. | en_US
dc.rights | The following publication Q. Yao and M.-W. Mak, "SNR-Invariant Multitask Deep Neural Networks for Robust Speaker Verification," in IEEE Signal Processing Letters, vol. 25, no. 11, pp. 1670-1674, Nov. 2018 is available at https://doi.org/10.1109/LSP.2018.2870726. | en_US
dc.subject | Deep learning | en_US
dc.subject | I-vectors | en_US
dc.subject | Multitask learning | en_US
dc.subject | Noise robustness | en_US
dc.subject | Speaker verification | en_US
dc.title | SNR-invariant multitask deep neural networks for robust speaker verification | en_US
dc.type | Journal/Magazine Article | en_US
dc.identifier.spage | 1670 | -
dc.identifier.epage | 1674 | -
dc.identifier.volume | 25 | -
dc.identifier.issue | 11 | -
dc.identifier.doi | 10.1109/LSP.2018.2870726 | -
dcterms.abstract | A major challenge in speaker verification is to achieve low error rates under noisy environments. We observed that background noise in utterances will not only enlarge the speaker-dependent i-vector clusters but also shift the clusters, with the amount of shift depending on the signal-to-noise ratio (SNR) of the utterances. To overcome this SNR-dependent clustering phenomenon, we propose two deep neural network (DNN) architectures: hierarchical regression DNN (H-RDNN) and multitask DNN (MT-DNN). The H-RDNN is formed by stacking two regression DNNs in which the lower DNN is trained to map noisy i-vectors to their respective speaker-dependent cluster means of clean i-vectors and the upper DNN aims to regularize the outliers that cannot be denoised properly by the lower DNN. The MT-DNN is trained to denoise i-vectors (main task) and classify speakers (auxiliary task). The network leverages the auxiliary task to retain speaker information in the denoised i-vectors. Experimental results suggest that these two DNN architectures together with the PLDA backend significantly outperform the multicondition PLDA model and mixtures of PLDA, and that multitask learning helps to boost verification performance. | -
dcterms.accessRights | open access | en_US
dcterms.bibliographicCitation | IEEE signal processing letters, Nov. 2018, v. 25, no. 11, p. 1670-1674 | -
dcterms.isPartOf | IEEE signal processing letters | -
dcterms.issued | 2018-11 | -
dc.identifier.scopus | 2-s2.0-85053349975 | -
dc.identifier.eissn | 1558-2361 | -
dc.description.validate | 202403 bckw | -
dc.description.oa | Accepted Manuscript | en_US
dc.identifier.FolderNumber | EIE-0455 | en_US
dc.description.fundingSource | RGC | en_US
dc.description.fundingSource | Others | en_US
dc.description.fundingText | HKPolyU | en_US
dc.description.pubStatus | Published | en_US
dc.identifier.OPUS | 20150464 | en_US
dc.description.oaCategory | Green (AAM) | en_US
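The dcterms.abstract field above outlines two denoising architectures for noisy i-vectors: a hierarchical regression DNN (H-RDNN) and a multitask DNN (MT-DNN) that denoises i-vectors as its main task while classifying speakers as an auxiliary task. As a rough illustration of the multitask idea only, the following PyTorch sketch shows a shared network with a denoising regression head and an auxiliary speaker-classification head; the layer sizes, the 500-dimensional i-vectors, the number of speakers, and the loss weight alpha are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch (not the authors' code) of a multitask denoising DNN in the
# spirit of the MT-DNN described in the abstract: a shared encoder maps a
# noisy i-vector to a denoised i-vector (main task) while an auxiliary head
# classifies the speaker. All dimensions and the loss weight are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultitaskDenoiser(nn.Module):
    def __init__(self, ivec_dim=500, hidden_dim=1024, n_speakers=1000):
        super().__init__()
        # Hidden layers shared by both tasks
        self.shared = nn.Sequential(
            nn.Linear(ivec_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        # Main task: regress the denoised (clean) i-vector
        self.denoise_head = nn.Linear(hidden_dim, ivec_dim)
        # Auxiliary task: speaker classification logits
        self.speaker_head = nn.Linear(hidden_dim, n_speakers)

    def forward(self, noisy_ivec):
        h = self.shared(noisy_ivec)
        return self.denoise_head(h), self.speaker_head(h)

def multitask_loss(denoised, clean_target, logits, speaker_ids, alpha=0.5):
    """Denoising MSE (main task) plus alpha-weighted speaker cross-entropy
    (auxiliary task); alpha = 0.5 is an assumed weighting."""
    return F.mse_loss(denoised, clean_target) + alpha * F.cross_entropy(logits, speaker_ids)

# Example forward/backward pass with random tensors (shapes only, no real data)
model = MultitaskDenoiser()
noisy = torch.randn(8, 500)          # batch of noisy i-vectors
clean = torch.randn(8, 500)          # stand-in for clean cluster-mean targets
spk = torch.randint(0, 1000, (8,))   # speaker labels
denoised, logits = model(noisy)
multitask_loss(denoised, clean, logits, spk).backward()
```

The auxiliary classification head is what encourages the denoised i-vectors to retain speaker information, which is the role the abstract attributes to the auxiliary task; at test time only the denoising output would be passed to the PLDA backend.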
Appears in Collections: Journal/Magazine Article
Files in This Item:
File | Description | Size | Format
Yao_Snr-Invariant_Multitask_Deep.pdf | Pre-Published version | 814.52 kB | Adobe PDF
Open Access Information
Status: open access
File Version: Final Accepted Manuscript