Please use this identifier to cite or link to this item:
Title: Blind stochastic feature transformation for speaker verification over cellular networks
Authors: Yiu, KK
Mak, MW 
Cheung, MC
Kung, SY
Issue Date: 2004
Source: Proceedings of the International Symposium on Intelligent Multimedia, Video and Speech Processing (ISIMP’2004), Hong Kong, 20-22 Oct. 2004, p. 679-682
Abstract: Acoustic mismatch between training and recognition conditions is one of the most serious challenges facing speaker recognition today. The goal of channel compensation is to approach the performance of a "matched-condition" system while avoiding the need for large amounts of training data. It is important to ensure that the compensation algorithms in these systems remove channel variation rather than speaker variation. This paper addresses unsupervised compensation, in which the features of a test utterance are transformed to fit a clean speaker model and a gender-dependent background model. Specifically, a feature-based transformation is estimated from the statistical difference between the test utterance and a composite acoustic model formed by combining the speaker and background models. By requiring the transformed features to fit both models, the transformation is implicitly constrained. Experimental results on the 2001 NIST evaluation set show that the proposed approach achieves significant improvements in both equal error rate and minimum detection cost compared with cepstral mean subtraction, Znorm and short-time Gaussianization.
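The core idea in the abstract, transforming distorted test features so that they fit a clean composite model, can be illustrated with a toy sketch. The code below is a hypothetical simplification, not the paper's actual estimator: it uses a two-component diagonal-covariance Gaussian mixture as the composite model, a per-utterance affine transform y = a·x + b, and a crude grid search in place of the EM-style maximum-likelihood updates a real system would use.

```python
import numpy as np

# Hypothetical sketch of blind stochastic feature transformation:
# estimate an affine transform y = a*x + b so that distorted test
# features x fit a "clean" composite Gaussian mixture (speaker +
# background). All models, dimensions and the grid search are
# illustrative simplifications, not the paper's method.

rng = np.random.default_rng(0)
D = 2  # toy feature dimension (real cepstral vectors have many more dims)

# Composite model: two diagonal-covariance Gaussian components
mu = np.array([[0.0, 0.0], [3.0, 3.0]])   # component means
var = np.ones((2, D))                     # diagonal variances
w = np.array([0.5, 0.5])                  # mixture weights

# Simulate channel distortion: clean features pushed through the inverse map
clean = np.vstack([rng.normal(m, 1.0, size=(200, D)) for m in mu])
true_a, true_b = 0.5, 1.5
x = (clean - true_b) / true_a             # so y = true_a*x + true_b == clean

def loglik(a, b):
    """Mean log-likelihood of transformed features under the mixture,
    including the log-Jacobian term D*log(a) of the affine transform."""
    y = a * x + b
    comp = np.stack([
        -0.5 * np.sum((y - mu[k]) ** 2 / var[k] + np.log(2 * np.pi * var[k]),
                      axis=1)
        for k in range(len(w))
    ])
    m = comp.max(axis=0)  # log-sum-exp for numerical stability
    ll = m + np.log(np.sum(w[:, None] * np.exp(comp - m), axis=0))
    return np.mean(ll) + D * np.log(a)

# Crude grid search stands in for the EM updates used in practice
best = max((loglik(a, b), a, b)
           for a in np.linspace(0.1, 2.0, 40)
           for b in np.linspace(-3.0, 3.0, 40))
_, a_hat, b_hat = best
print(f"estimated a={a_hat:.2f}, b={b_hat:.2f} (true a={true_a}, b={true_b})")
```

Because the transform is estimated against the composite model rather than the speaker model alone, the fit is implicitly constrained, which is the "blind" aspect: no stereo or labelled channel data is needed, only the test utterance itself.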
Keywords: Cellular radio
Speaker recognition
Statistical analysis
Stochastic processes
ISBN: 0-7803-8687-6
DOI: 10.1109/ISIMP.2004.1434155
Appears in Collections: Conference Paper
