Robust speaker verification using deep weight space ensemble

Lin, W; Mak, MW

doi:10.1109/TASLP.2022.3233231

Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/97219

Title:	Robust speaker verification using deep weight space ensemble
Authors:	Lin, W; Mak, MW
Issue Date:	2023
Source:	IEEE/ACM transactions on audio, speech, and language processing, 2023, v. 31, p. 802-812
Abstract:	Domain shift is one of the most challenging problems in speaker verification. Although numerous methods have been proposed to address domain shift, most approaches optimize the performance of one domain at the sacrifice of the other. As a result, to obtain the best performance, each domain requires a dedicated model. However, deploying multiple models is resource-demanding and impractical, particularly when the deployment domains are not known in advance. Recent studies in deep neural networks (DNNs) suggest that near the low error surface of the DNN's weight space, there exists a linear path connecting a base model and a fine-tuned model. This finding inspires us to combine the strength of the fine-tuned models and the base models to solve challenging SV problems. Specifically, we aim to develop models that can handle 1) mixed text-dependent (TD) and text-independent (TI) speaker verification where the speech content can be either unconstrained or constrained, 2) cross-channel speaker verification where the recording can be 16 kHz high-fidelity microphone speech or 8 kHz telephone speech, and 3) bi-lingual speaker verification where the enrollment and test speech can be one of the two languages. With weight space ensemble, we show that we can substantially improve the tasks mentioned above, with a 39.6% improvement in mixing TD and TI SV, a 17.4% improvement in bi-lingual SV, and an 18.4% improvement in cross-channel SV. Moreover, we show that the weight space ensemble can also enhance the performance in the target domain, thanks to the regularization effect of the interpolation.
Keywords:	Robust speaker recognition; Domain adaptation; Domain shift; Weight space ensemble
Publisher:	Institute of Electrical and Electronics Engineers
Journal:	IEEE/ACM transactions on audio, speech, and language processing
ISSN:	2329-9290
EISSN:	2329-9304
DOI:	10.1109/TASLP.2022.3233231
Rights:	© 2023 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The following publication W. Lin and M. -W. Mak, "Robust Speaker Verification Using Deep Weight Space Ensemble," in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 31, pp. 802-812, 2023 is available at https://doi.org/10.1109/TASLP.2022.3233231.
Appears in Collections:	Journal/Magazine Article
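
The abstract describes combining a base model and a fine-tuned model along a linear path in weight space. The following is a minimal sketch of that idea, not the authors' implementation: it assumes the ensemble is a per-parameter linear interpolation with a single coefficient `alpha`. The function name `interpolate_weights`, the coefficient name, and the toy parameter dictionaries are all hypothetical and do not come from the paper.

```python
def interpolate_weights(base, finetuned, alpha):
    """Return (1 - alpha) * base + alpha * finetuned for each named parameter.

    `base` and `finetuned` map parameter names to flat lists of floats.
    alpha = 0 gives the base model, alpha = 1 gives the fine-tuned model.
    This single-coefficient form is an assumption made for illustration.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if base.keys() != finetuned.keys():
        raise ValueError("models must share the same parameter names")
    merged = {}
    for name, w_base in base.items():
        w_ft = finetuned[name]
        if len(w_base) != len(w_ft):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [(1.0 - alpha) * b + alpha * f
                        for b, f in zip(w_base, w_ft)]
    return merged


# Toy parameters standing in for a base model and its fine-tuned copy.
base_model = {"layer.weight": [0.0, 2.0], "layer.bias": [1.0]}
finetuned_model = {"layer.weight": [1.0, 4.0], "layer.bias": [3.0]}

# A single set of weights halfway between the two models.
ensemble = interpolate_weights(base_model, finetuned_model, alpha=0.5)
```

Because the result is one set of weights, it can be deployed as a single model rather than one model per domain. How the coefficient should be chosen is not stated in the abstract, so any value used with this sketch would need to be tuned on held-out data.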

Files in This Item:

File	Description	Size	Format
weight_ensemble.pdf	Pre-Published version	1.33 MB	Adobe PDF

Open Access Information

Status	open access
File Version	Final Accepted Manuscript

Metric	Count	As of
Page views	118	Feb 9, 2026
Downloads	232	Feb 9, 2026
Scopus citations	5	Jun 21, 2024
Web of Science citations	2	Oct 10, 2024
