Please use this identifier to cite or link to this item:
http://hdl.handle.net/10397/107135
Title: Variational domain adversarial learning with mutual information maximization for speaker verification
Authors: Tu, Y; Mak, MW; Chien, JT
Issue Date: 2020
Source: IEEE/ACM transactions on audio, speech, and language processing, 2020, v. 28, p. 2013-2024
Abstract: Domain mismatch is a common problem in speaker verification (SV) and often causes performance degradation. For the system relying on the Gaussian PLDA backend to suppress the channel variability, the performance would be further limited if there is no Gaussianity constraint on the learned embeddings. This paper proposes an information-maximized variational domain adversarial neural network (InfoVDANN) that incorporates an InfoVAE into domain adversarial training (DAT) to reduce domain mismatch and simultaneously meet the Gaussianity requirement of the PLDA backend. Specifically, DAT is applied to produce speaker discriminative and domain-invariant features, while the InfoVAE performs variational regularization on the embedded features so that they follow a Gaussian distribution. Another benefit of the InfoVAE is that it avoids posterior collapse in VAEs by preserving the mutual information between the embedded features and the training set so that extra speaker information can be retained in the features. Experiments on both SRE16 and SRE18-CMN2 show that the InfoVDANN outperforms the recent VDANN, which suggests that increasing the mutual information between the embedded features and input features enables the InfoVDANN to extract extra speaker information that is otherwise not possible.
Keywords: Domain adaptation; Domain adversarial training; Mutual information; Speaker verification (SV); Variational autoencoder
Publisher: Institute of Electrical and Electronics Engineers
Journal: IEEE/ACM transactions on audio, speech, and language processing
ISSN: 2329-9290
EISSN: 2329-9304
DOI: 10.1109/TASLP.2020.3004760
Rights: © 2020 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. The following publication Y. Tu, M. -W. Mak and J. -T. Chien, "Variational Domain Adversarial Learning With Mutual Information Maximization for Speaker Verification," in IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 28, pp. 2013-2024, 2020 is available at https://doi.org/10.1109/TASLP.2020.3004760.
Appears in Collections: Journal/Magazine Article
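Illustrative note: the abstract names two ingredients, a variational regularizer that pushes embeddings toward a Gaussian distribution and domain adversarial training. The sketch below shows one common way each can be written. It assumes a diagonal-Gaussian posterior, a standard-normal prior, and a gradient-reversal formulation of DAT, none of which this record confirms. The function names are hypothetical, the mutual-information term mentioned in the abstract is omitted, and this is not the authors' implementation.

```python
import math


def gaussian_kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) for one embedding vector.

    Assumption: diagonal-Gaussian posterior and standard-normal prior.
    The value is 0 when mu = 0 and log_var = 0, and grows as the
    embedding distribution moves away from the prior.
    """
    return 0.5 * sum(
        m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var)
    )


def reverse_gradient(grad, lam=1.0):
    """Backward-pass rule of a gradient reversal layer (hypothetical helper).

    The forward pass is the identity; in the backward pass the gradient
    coming from the domain classifier is multiplied by -lam, so the
    feature extractor is pushed toward domain-invariant features.
    """
    return [-lam * g for g in grad]


if __name__ == "__main__":
    print(gaussian_kl_to_standard_normal([1.0, 0.0], [0.0, 0.0]))
    print(reverse_gradient([1.0, -2.0], lam=0.5))
```

In a training loop, a term of the first kind would be added to the speaker-classification loss, while the second rule would sit between the embedding layer and a domain classifier.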
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
Lin_Variational_Domain_Adversarial.pdf | Pre-Published version | 1.73 MB | Adobe PDF |
Page views: 5 (as of Jun 30, 2024)
Downloads: 5 (as of Jun 30, 2024)
SCOPUS™ citations: 31 (as of Jun 21, 2024)
Web of Science™ citations: 25 (as of Jun 27, 2024)