Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/111708
Title: UNet-DenseNet for robust far-field speaker verification
Authors: Gao, Z; Mak, MW; Lin, W
Issue Date: 2022
Source: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2022, p. 3714-3718
Abstract: Far-field speaker verification (SV) has always been critical but challenging. Data augmentation is commonly used to overcome the problems arising from far-field microphones, such as high background noise levels and reverberation effects. On top of data augmentation, this paper tackles these problems by introducing a UNet-based speech enhancement (SE) module as a front-end processor for the speaker embedding module. To prevent the SE module from distorting speaker information, we propose two improvements to the speech enhancement–speaker embedding pipeline. (1) A UNet-DenseNet joint training scheme in which the UNet is optimized by both the MSE and speaker classification losses. (2) A semi-joint training scheme that stops the UNet training but continues the DenseNet training when overfitting of the UNet is detected. Extensive experiments on noise-contaminated VoxCeleb1 and the VOiCES Challenge 2019 demonstrate the effectiveness of the two training schemes.
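The two training schemes named in the abstract can be illustrated with a minimal, framework-free sketch. It is not the authors' implementation: the loss weight `weight`, the helper names, and the patience-based overfitting test in `SemiJointSchedule` are all assumptions made for illustration, since the abstract does not say how the losses are weighted or how UNet overfitting is detected.

```python
import math
from dataclasses import dataclass


def mse_loss(enhanced, clean):
    """Mean squared error between enhanced and clean feature vectors."""
    return sum((e - c) ** 2 for e, c in zip(enhanced, clean)) / len(clean)


def cross_entropy(logits, speaker_id):
    """Softmax cross-entropy for one utterance (numerically stable)."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[speaker_id]


def joint_loss(enhanced, clean, logits, speaker_id, weight=1.0):
    """Assumed joint objective: MSE plus a weighted speaker classification loss.

    `weight` is a hypothetical hyperparameter, not taken from the paper.
    """
    return mse_loss(enhanced, clean) + weight * cross_entropy(logits, speaker_id)


@dataclass
class SemiJointSchedule:
    """Decide when to stop updating the UNet while the DenseNet keeps training.

    Assumed criterion: freeze the UNet once its validation MSE has failed to
    improve for `patience` consecutive epochs.
    """
    patience: int = 2
    best: float = float("inf")
    bad_epochs: int = 0
    unet_frozen: bool = False

    def update(self, val_mse):
        """Record one epoch's validation MSE; return True if the UNet is frozen."""
        if self.unet_frozen:
            return True
        if val_mse < self.best:
            self.best = val_mse
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs >= self.patience:
                self.unet_frozen = True
        return self.unet_frozen
```

In a real training loop, `joint_loss` would be back-propagated through both networks until `SemiJointSchedule.update` returns True, after which only the speaker embedding network's parameters would continue to be updated.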
Publisher: International Speech Communication Association
DOI: 10.21437/Interspeech.2022-10350
Description: 23rd Annual Conference of the International Speech Communication Association, INTERSPEECH 2022, Incheon, Korea, September 18-22, 2022
Rights: Copyright © 2022 ISCA
The following publication Gao, Z., Mak, M., Lin, W. (2022) UNet-DenseNet for Robust Far-Field Speaker Verification. Proc. Interspeech 2022, 3714-3718 is available at https://doi.org/10.21437/Interspeech.2022-10350.
Appears in Collections: Conference Paper

Files in This Item:
File: gao22c_interspeech.pdf (815.29 kB, Adobe PDF)
Open Access Information
Status: open access
File Version: Version of Record