Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/3859
Title: Speaker verification based on probabilistic neural networks with a priori decision thresholds
Authors: Yiu, Kwok-kwong Michael
Keywords: Speech processing systems
Automatic speech recognition
Neural networks (Computer science)
Hong Kong Polytechnic University -- Dissertations
Issue Date: 2000
Publisher: The Hong Kong Polytechnic University
Abstract: Speaker verification is the task of verifying the identity of a speaker based on his or her voice. Typically, a speaker verification system requires one or more decision thresholds for making verification decisions: accepting genuine users and rejecting impostors. For the purpose of comparing the performance of different systems, researchers usually adjust the thresholds during verification in order to equalise the false acceptance rate and the false rejection rate. However, in real-world environments, the thresholds should be determined prior to verification. In conventional approaches to speaker verification, a speaker model is constructed for each user, followed by a threshold determination procedure. While this two-step approach has been successful in many situations, it does not account for the interaction between the speaker models and the decision thresholds. In this dissertation, we integrate the speaker model construction and threshold determination procedures in a single framework by using probabilistic decision-based neural networks (PDBNNs). A PDBNN can be considered as a Gaussian mixture model (GMM) with trainable decision thresholds. GMMs have been widely used as speaker models because of their capability to model arbitrary density functions. However, GMMs have limitations as they do not provide a proper mechanism for setting decision thresholds. By using the thresholding mechanism of PDBNNs, this dissertation aims to improve the robustness of speaker verification systems against intruder attacks.
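The distinction drawn above between adjusting a threshold during verification and fixing it beforehand can be illustrated with a minimal sketch. This is not the dissertation's PDBNN training procedure: the score lists, the function names, and the rule of picking the threshold on a separate held-out set are all assumptions made for illustration. Scores at or above the threshold are treated as acceptances.

```python
def error_rates(genuine, impostor, threshold):
    """Return (false acceptance rate, false rejection rate) at a threshold."""
    far = sum(s >= threshold for s in impostor) / len(impostor)
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_threshold(genuine, impostor):
    """Pick the observed score at which FAR and FRR are closest to equal."""
    def gap(t):
        far, frr = error_rates(genuine, impostor, t)
        return abs(far - frr)
    return min(sorted(set(genuine) | set(impostor)), key=gap)


# Hypothetical held-out scores, used to fix the threshold a priori.
dev_genuine = [2.1, 1.8, 2.5, 1.2, 2.9]
dev_impostor = [0.3, 0.9, 1.4, 0.1, 0.7]
threshold = equal_error_threshold(dev_genuine, dev_impostor)

# Hypothetical verification-time scores: the threshold is no longer tuned.
test_genuine = [2.0, 1.3, 2.7, 1.9]
test_impostor = [0.5, 1.5, 1.6, 1.0]

dev_far, dev_frr = error_rates(dev_genuine, dev_impostor, threshold)
test_far, test_frr = error_rates(test_genuine, test_impostor, threshold)
print(threshold, (dev_far, dev_frr), (test_far, test_frr))
```

On the held-out scores the two error rates coincide by construction, whereas on the unseen scores they generally do not, which is why an equal error rate obtained by tuning the threshold on the test data can be optimistic.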
This dissertation begins with detailed illustrations to compare the decision boundaries of PDBNNs with those of GMMs. The comparison is based on two pattern recognition tasks, namely the noisy XOR problem and the classification of two-dimensional vowel data. Experimental results show that the thresholding mechanism of PDBNNs is very effective in detecting data not belonging to any known class. Based on this finding, the dissertation explains how the networks can be extended to speaker verification. Experimental evaluations based on 138 speakers of the YOHO corpus have been conducted. It is found that the error rate obtained by the PDBNNs is about half of that of Higgins et al. (a benchmark error rate for the YOHO corpus), suggesting that the discriminative training procedure of PDBNNs is able to improve the robustness of the speaker models. It is also found that the discriminative training procedure of PDBNNs is able to embed the background speakers' characteristics in the speaker models, resulting in a substantial saving in computational resources during verification. This work has also explored various channel compensation techniques for speaker verification over the public telephone network. A new channel compensation approach, which is based on the measurement of telephone handsets' frequency responses, is proposed. The capability of various channel compensation methods, such as cepstral mean subtraction and signal bias removal, in reducing channel distortion is compared with that of the proposed approach. Results show that the proposed approach outperforms the conventional cepstral mean subtraction but is slightly inferior to signal bias removal.
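Cepstral mean subtraction, one of the conventional baselines mentioned above, can be sketched in a few lines. This is a generic illustration, not the implementation used in the dissertation and not the proposed handset-based approach. It assumes that a fixed channel appears as a constant additive offset on every cepstral frame; the frame values and the channel offset below are made up.

```python
def cepstral_mean_subtraction(frames):
    """Subtract the per-coefficient mean over all frames of an utterance."""
    n_frames = len(frames)
    dim = len(frames[0])
    mean = [sum(frame[d] for frame in frames) / n_frames for d in range(dim)]
    return [[frame[d] - mean[d] for d in range(dim)] for frame in frames]


# Hypothetical two-coefficient cepstral frames of a clean utterance.
clean = [[1.0, 2.0], [3.0, 0.0], [2.0, 4.0]]

# Assumed stationary channel: the same offset is added to every frame.
channel = [0.5, -1.0]
distorted = [[c + h for c, h in zip(frame, channel)] for frame in clean]

clean_cms = cepstral_mean_subtraction(clean)
distorted_cms = cepstral_mean_subtraction(distorted)
print(clean_cms)
print(distorted_cms)
```

Under this assumption the normalised features of the clean and the distorted utterance coincide, because the constant offset is absorbed into the mean that is subtracted.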
Description: ix, 115 leaves : ill. ; 30 cm
PolyU Library Call No.: [THS] LG51 .H577M EIE 2000 Yiu
URI: http://hdl.handle.net/10397/3859
Rights: All rights reserved.
Appears in Collections: Thesis

Files in This Item:
File: b15353898_link.htm; Description: For PolyU Users; Size: 162 B; Format: HTML
File: b15353898_ir.pdf; Description: For All Users (Non-printable); Size: 3.58 MB; Format: Adobe PDF

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.