Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/76008
Title: Discriminative subspace modeling of SNR and duration variabilities for robust speaker verification
Authors: Li, N 
Mak, MW 
Lin, WW 
Chien, JT
Keywords: Speaker verification
Duration variation
SNR mismatch
Variational Bayes
I-vector
PLDA
Issue Date: 2017
Publisher: Academic Press
Source: Computer speech and language, 2017, v. 45, p. 83-103
Journal: Computer speech and language 
Abstract: Although i-vectors together with probabilistic LDA (PLDA) have achieved great success in speaker verification, how to suppress the undesirable effects caused by variability in utterance length and background noise level is still a challenge. This paper aims to improve the robustness of i-vector based speaker verification systems by compensating for utterance-length variability and noise-level variability. Inspired by the recent findings that noise-level variability can be modeled by a signal-to-noise ratio (SNR) subspace and that duration variability can be modeled as additive noise in the i-vector space, we propose to add an SNR factor and a duration factor to the PLDA model. In this framework, we assume that i-vectors derived from utterances with comparable durations share similar duration-specific information and that i-vectors extracted from utterances within a narrow SNR range have similar SNR-specific information. Based on these assumptions, an i-vector can be represented as a linear combination of four components: speaker, SNR, duration, and channel. A variational Bayes algorithm is developed to infer this latent variable model via a discriminative subspace training procedure. In the testing stage, different variabilities are compensated for when computing the likelihood ratio. Experiments on Common Conditions 1 and 4 in NIST 2012 SRE show that the proposed model outperforms the conventional PLDA and SNR-invariant PLDA. Results also show that the proposed model performs better than the uncertainty-propagation PLDA (UP-PLDA) for long test utterances.
URI: http://hdl.handle.net/10397/76008
ISSN: 0885-2308
EISSN: 1095-8363
DOI: 10.1016/j.csl.2017.04.001
Appears in Collections: Journal/Magazine Article

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.