Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/97219
DC Field | Value | Language
dc.contributor | Department of Electronic and Information Engineering | en_US
dc.creator | Lin, W | en_US
dc.creator | Mak, MW | en_US
dc.date.accessioned | 2023-02-20T06:16:37Z | -
dc.date.available | 2023-02-20T06:16:37Z | -
dc.identifier.issn | 2329-9290 | en_US
dc.identifier.uri | http://hdl.handle.net/10397/97219 | -
dc.language.iso | en | en_US
dc.publisher | Institute of Electrical and Electronics Engineers | en_US
dc.rights | © 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. | en_US
dc.rights | The following publication W. Lin and M. -W. Mak, "Robust Speaker Verification Using Deep Weight Space Ensemble," in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 802-812, 2023 is available at https://doi.org/10.1109/TASLP.2022.3233231. | en_US
dc.subject | Robust speaker recognition | en_US
dc.subject | Domain adaptation | en_US
dc.subject | Domain shift | en_US
dc.subject | Weight space ensemble | en_US
dc.title | Robust speaker verification using deep weight space ensemble | en_US
dc.type | Journal/Magazine Article | en_US
dc.identifier.spage | 802 | en_US
dc.identifier.epage | 812 | en_US
dc.identifier.volume | 31 | en_US
dc.identifier.doi | 10.1109/TASLP.2022.3233231 | en_US
dcterms.abstract | Domain shift is one of the most challenging problems in speaker verification. Although numerous methods have been proposed to address domain shift, most approaches optimize the performance of one domain at the sacrifice of the other. As a result, to obtain the best performance, each domain requires a dedicated model. However, deploying multiple models is resource-demanding and impractical, particularly when the deployment domains are not known in advance. Recent studies in deep neural networks (DNNs) suggest that near the low error surface of the DNN's weight space, there exists a linear path connecting a base model and a fine-tuned model. This finding inspires us to combine the strength of the fine-tuned models and the base models to solve challenging SV problems. Specifically, we aim to develop models that can handle 1) mixed text-dependent (TD) and text-independent (TI) speaker verification where the speech content can be either unconstrained or constrained, 2) cross-channel speaker verification where the recording can be 16 kHz high-fidelity microphone speech or 8 kHz telephone speech, and 3) bi-lingual speaker verification where the enrollment and test speech can be one of the two languages. With weight space ensemble, we show that we can substantially improve the tasks mentioned above, with a 39.6% improvement in mixing TD and TI SV, a 17.4% improvement in bi-lingual SV, and an 18.4% improvement in cross-channel SV. Moreover, we show that the weight space ensemble can also enhance the performance in the target domain, thanks to the regularization effect of the interpolation. | en_US
dcterms.accessRights | open access | en_US
dcterms.bibliographicCitation | IEEE/ACM transactions on audio, speech, and language processing, 2023, v. 31, p. 802-812 | en_US
dcterms.isPartOf | IEEE/ACM transactions on audio, speech, and language processing | en_US
dcterms.issued | 2023 | -
dc.identifier.eissn | 2329-9304 | en_US
dc.description.validate | 202302 bcww | en_US
dc.description.oa | Accepted Manuscript | en_US
dc.identifier.FolderNumber | a1929 | -
dc.identifier.SubFormID | 46147 | -
dc.description.fundingSource | Others | en_US
dc.description.fundingText | National Natural Science Foundation of China | en_US
dc.description.pubStatus | Published | en_US
dc.description.oaCategory | Green (AAM) | en_US
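Illustrative note on the abstract: the weight space ensemble it describes combines a base model and a fine-tuned model along a linear path in weight space. The sketch below shows only that general idea, as a per-parameter linear interpolation. It is not the authors' implementation: the function name `interpolate_weights`, the mixing coefficient `alpha`, and the toy parameter names are assumptions made for illustration, and this record does not state how the paper chooses the coefficient or which layers it interpolates.

```python
from typing import Dict, List

# Hypothetical representation: parameter name -> flat list of weights.
Weights = Dict[str, List[float]]


def interpolate_weights(base: Weights, fine_tuned: Weights, alpha: float) -> Weights:
    """Return (1 - alpha) * base + alpha * fine_tuned, parameter by parameter."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if base.keys() != fine_tuned.keys():
        raise ValueError("both models must have the same parameter names")

    merged: Weights = {}
    for name, base_values in base.items():
        tuned_values = fine_tuned[name]
        if len(base_values) != len(tuned_values):
            raise ValueError(f"size mismatch for parameter {name!r}")
        # Walk along the straight line between the two weight vectors.
        merged[name] = [
            (1.0 - alpha) * b + alpha * t
            for b, t in zip(base_values, tuned_values)
        ]
    return merged


# Toy weights standing in for a base and a fine-tuned speaker model.
base = {"layer.weight": [0.0, 2.0], "layer.bias": [1.0]}
fine_tuned = {"layer.weight": [1.0, 4.0], "layer.bias": [3.0]}

ensemble = interpolate_weights(base, fine_tuned, alpha=0.5)
print(ensemble)
```

With `alpha=0.0` the result equals the base model and with `alpha=1.0` it equals the fine-tuned model. Intermediate values give a single set of weights, so only one model needs to be deployed.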
Appears in Collections: Journal/Magazine Article
Files in This Item:
File | Description | Size | Format
weight_ensemble.pdf | Pre-Published version | 1.33 MB | Adobe PDF
Open Access Information
Status open access
File Version Final Accepted Manuscript

Page views: 72 (as of Apr 14, 2025)
Downloads: 89 (as of Apr 14, 2025)
Scopus™ citations: 5 (as of Jun 21, 2024)
Web of Science™ citations: 2 (as of Oct 10, 2024)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.