Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/76008
Title: Discriminative subspace modeling of SNR and duration variabilities for robust speaker verification
Authors: Li, N 
Mak, MW 
Lin, WW 
Chien, JT
Keywords: Speaker verification
Duration variation
SNR mismatch
Variational Bayes
I-vector
PLDA
Issue Date: 2017
Publisher: Academic Press
Source: Computer speech and language, 2017, v. 45, p. 83-103
Journal: Computer speech and language 
Abstract: Although i-vectors together with probabilistic LDA (PLDA) have achieved great success in speaker verification, how to suppress the undesirable effects caused by the variability in utterance length and background noise level is still a challenge. This paper aims to improve the robustness of i-vector based speaker verification systems by compensating for the utterance-length variability and noise-level variability. Inspired by the recent findings that noise-level variability can be modeled by a signal-to-noise ratio (SNR) subspace and that duration variability can be modeled as additive noise in the i-vector space, we propose to add an SNR factor and a duration factor to the PLDA model. In this framework, we assume that i-vectors derived from utterances with comparable durations share similar duration-specific information and that i-vectors extracted from utterances within a narrow SNR range have similar SNR-specific information. Based on these assumptions, an i-vector can be represented as a linear combination of four components: speaker, SNR, duration, and channel. A variational Bayes algorithm is developed to infer this latent variable model via a discriminative subspace training procedure. In the testing stage, different variabilities are compensated for when computing the likelihood ratio. Experiments on Common Conditions 1 and 4 in NIST 2012 SRE show that the proposed model outperforms the conventional PLDA and SNR-invariant PLDA. Results also show that the proposed model performs better than the uncertainty-propagation PLDA (UP-PLDA) for long test utterances.
URI: http://hdl.handle.net/10397/76008
ISSN: 0885-2308
EISSN: 1095-8363
DOI: 10.1016/j.csl.2017.04.001
Appears in Collections: Journal/Magazine Article

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.