Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/111708
Title: UNet-DenseNet for robust far-field speaker verification
Authors: Gao, Z; Mak, MW; Lin, W
Issue Date: 2022
Source: Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2022, p. 3714-3718
Abstract: Far-field speaker verification (SV) has always been critical but challenging. Data augmentation is commonly used to overcome the problems arising from far-field microphones, such as high background noise levels and reverberation effects. On top of data augmentation, this paper tackles these problems by introducing a UNet-based speech enhancement (SE) module as a front-end processor for the speaker embedding module. To prevent the SE module from distorting speaker information, we propose two improvements to the speech enhancement–speaker embedding pipeline. (1) A UNet-DenseNet joint training scheme in which the UNet is optimized by both the MSE and speaker classification losses. (2) A semi-joint training scheme that stops the UNet training but continues the DenseNet training when overfitting of the UNet is detected. Extensive experiments on noise-contaminated Voxceleb1 and the VOiCES Challenge 2019 demonstrate the effectiveness of the two training schemes.
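The two training schemes named in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. The loss weighting (`mse_weight`), the overfitting test (validation MSE failing to improve for `patience` epochs), and the names `joint_loss`, `SemiJointSchedule`, and `trainable_modules` are all assumptions made for illustration; the abstract does not specify them.

```python
from dataclasses import dataclass


def joint_loss(mse: float, speaker_ce: float, mse_weight: float = 1.0) -> float:
    """Combine the enhancement (MSE) loss and the speaker-classification loss.

    Assumption: a simple weighted sum. The paper's actual weighting is not
    given in the abstract.
    """
    return mse_weight * mse + speaker_ce


@dataclass
class SemiJointSchedule:
    """Tracks whether the UNet should keep training (hypothetical helper)."""

    patience: int = 2                    # assumed: tolerated epochs without improvement
    best_val_mse: float = float("inf")   # lowest validation MSE seen so far
    bad_epochs: int = 0                  # consecutive epochs without improvement
    unet_frozen: bool = False

    def update(self, val_mse: float) -> bool:
        """Record one epoch's validation MSE; return True once the UNet is frozen."""
        if self.unet_frozen:
            return True  # once frozen, the UNet stays frozen
        if val_mse < self.best_val_mse:
            self.best_val_mse = val_mse
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
            if self.bad_epochs >= self.patience:
                # Assumed overfitting signal: stop UNet updates from here on.
                self.unet_frozen = True
        return self.unet_frozen


def trainable_modules(schedule: SemiJointSchedule) -> list:
    """The DenseNet always trains; the UNet trains only until it is frozen."""
    return ["densenet"] if schedule.unet_frozen else ["unet", "densenet"]


if __name__ == "__main__":
    schedule = SemiJointSchedule(patience=2)
    # Made-up validation curve, used only to exercise the schedule.
    for epoch, val_mse in enumerate([0.50, 0.40, 0.42, 0.45, 0.30]):
        schedule.update(val_mse)
        print(epoch, trainable_modules(schedule))
```

In a real training loop, freezing would typically mean excluding the UNet parameters from the optimizer step while the speaker embedding network continues to receive gradient updates.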
Publisher: International Speech Communication Association
DOI: 10.21437/Interspeech.2022-10350
Description: 23rd Annual Conference of the International Speech Communication Association, INTERSPEECH 2022, Incheon, Korea, September 18-22, 2022
Rights: Copyright © 2022 ISCA
The following publication Gao, Z., Mak, M., Lin, W. (2022) UNet-DenseNet for Robust Far-Field Speaker Verification. Proc. Interspeech 2022, 3714-3718 is available at https://doi.org/10.21437/Interspeech.2022-10350.
Appears in Collections: Conference Paper

Files in This Item:
File: gao22c_interspeech.pdf
Size: 815.29 kB
Format: Adobe PDF
Open Access Information
Status: open access
File Version: Version of Record