|Title:||Speaker verification based on probabilistic neural networks with a priori decision thresholds|
|Authors:||Yiu, Kwok-kwong Michael|
|Keywords:||Speech processing systems
Automatic speech recognition
Neural networks (Computer science)
Hong Kong Polytechnic University -- Dissertations
|Issue Date:||2000|
|Publisher:||The Hong Kong Polytechnic University|
|Abstract:||Speaker verification is the task of verifying the identity of a speaker based on his or her voice. Typically, a speaker verification system requires one or more decision thresholds for making verification decisions: accepting genuine users and rejecting impostors. For the purpose of comparing the performance of different systems, researchers usually adjust the thresholds during verification in order to equalise the false acceptance rate and the false rejection rate. However, in a real-world environment, the thresholds must be determined prior to verification. In conventional approaches to speaker verification, a speaker model is constructed for each user, followed by a threshold determination procedure. While this two-step approach has been successful in many situations, it does not account for the interaction between the speaker models and the decision thresholds. In this dissertation, we integrate the speaker model construction and threshold determination procedures in a single framework by using probabilistic decision-based neural networks (PDBNNs). A PDBNN can be considered as a Gaussian mixture model (GMM) with trainable decision thresholds. GMMs have been widely used as speaker models because of their capability to model arbitrary density functions. However, GMMs are limited in that they do not provide a proper mechanism for setting decision thresholds. By using the thresholding mechanism of PDBNNs, this dissertation aims to improve the robustness of speaker verification systems against intruder attacks.
This dissertation begins with detailed illustrations that compare the decision boundaries of PDBNNs with those of GMMs. The comparison is based on two pattern recognition tasks, namely the noisy XOR problem and the classification of two-dimensional vowel data. Experimental results show that the thresholding mechanism of PDBNNs is very effective in detecting data not belonging to any known class. Based on this finding, the dissertation explains how the networks can be extended to speaker verification. Experimental evaluations based on 138 speakers of the YOHO corpus have been conducted. It is found that the error rate obtained by the PDBNNs is about half of that of Higgins et al. (a benchmark error rate for the YOHO corpus), suggesting that the discriminative training procedure of PDBNNs is able to improve the robustness of the speaker models. It is also found that the discriminative training procedure of PDBNNs is able to embed the background speakers' characteristics in the speaker models, resulting in a substantial saving in computational resources during verification. This work has also explored various channel compensation techniques for speaker verification over the public telephone network. A new channel compensation approach, which is based on the measurement of telephone handsets' frequency responses, is proposed. The capability of various channel compensation methods, such as cepstral mean subtraction and signal bias removal, in reducing channel distortion is compared with that of the proposed approach. Results show that the proposed approach outperforms the conventional cepstral mean subtraction but is slightly inferior to signal bias removal.
|Description:||ix, 115 leaves : ill. ; 30 cm
PolyU Library Call No.: [THS] LG51 .H577M EIE 2000 Yiu
|URI:||http://hdl.handle.net/10397/3859||Rights:||All rights reserved.|
|Appears in Collections:||Thesis|
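The abstract contrasts thresholds tuned during verification to equalise the false acceptance rate (FAR) and false rejection rate (FRR) with thresholds fixed beforehand. The Python sketch below illustrates only that contrast and is not the dissertation's PDBNN method. The function names `error_rates` and `equal_error_threshold`, the accept-if-score-is-at-least-threshold rule, and all score values are assumptions made up for the example.

```python
def error_rates(genuine_scores, impostor_scores, threshold):
    """Return (FAR, FRR), assuming a claim is accepted when score >= threshold."""
    # FAR: fraction of impostor trials that are wrongly accepted.
    far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
    # FRR: fraction of genuine trials that are wrongly rejected.
    frr = sum(s < threshold for s in genuine_scores) / len(genuine_scores)
    return far, frr


def equal_error_threshold(genuine_scores, impostor_scores):
    """Return the observed score at which |FAR - FRR| is smallest.

    This looks at the verification scores themselves, so it is an
    a posteriori choice: handy for comparing systems, but unavailable
    to a deployed system that must fix its threshold in advance.
    """
    def gap(threshold):
        far, frr = error_rates(genuine_scores, impostor_scores, threshold)
        return abs(far - frr)

    candidates = sorted(set(genuine_scores) | set(impostor_scores))
    return min(candidates, key=gap)


# Made-up verification scores, for illustration only.
genuine = [0.9, 0.8, 0.75, 0.6, 0.4]
impostor = [0.7, 0.5, 0.3, 0.2, 0.1]

# Threshold tuned on the verification scores (a posteriori).
tuned = equal_error_threshold(genuine, impostor)
tuned_rates = error_rates(genuine, impostor, tuned)

# Hypothetical threshold fixed before any verification score is seen (a priori).
fixed = 0.5
fixed_rates = error_rates(genuine, impostor, fixed)
```

On this toy data the tuned threshold is 0.6, where FAR and FRR are both 0.2, while the fixed threshold of 0.5 gives a FAR of 0.4 and an FRR of 0.2. The gap between the two operating points is the kind of mismatch that motivates choosing thresholds before verification rather than after it.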
Files in This Item:
|b15353898_link.htm||For PolyU Users||162 B||HTML||View/Open|
|b15353898_ir.pdf||For All Users (Non-printable)||3.58 MB||Adobe PDF||View/Open|
Citations as of Sep 18, 2018
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.