Please use this identifier to cite or link to this item:
http://hdl.handle.net/10397/106995
DC Field | Value | Language |
---|---|---|
dc.contributor | Department of Electrical and Electronic Engineering | en_US |
dc.creator | Tan, Z | en_US |
dc.creator | Mak, MW | en_US |
dc.date.accessioned | 2024-06-07T00:59:30Z | - |
dc.date.available | 2024-06-07T00:59:30Z | - |
dc.identifier.isbn | 978-1-5108-4876-4 | en_US |
dc.identifier.uri | http://hdl.handle.net/10397/106995 | - |
dc.description | 18th Annual Conference of the International Speech Communication Association, INTERSPEECH 2017, Stockholm, Sweden, 20-24 August 2017 | en_US |
dc.language.iso | en | en_US |
dc.publisher | International Speech Communication Association (ISCA) | en_US |
dc.rights | Copyright © 2017 ISCA | en_US |
dc.rights | The following publication Tan, Z., Mak, M.-W. (2017) i-Vector DNN Scoring and Calibration for Noise Robust Speaker Verification. Proc. Interspeech 2017, 1562-1566 is available at https://doi.org/10.21437/Interspeech.2017-656. | en_US |
dc.title | I-Vector DNN scoring and calibration for noise robust speaker verification | en_US |
dc.type | Conference Paper | en_US |
dc.identifier.spage | 1562 | en_US |
dc.identifier.epage | 1566 | en_US |
dc.identifier.doi | 10.21437/Interspeech.2017-656 | en_US |
dcterms.abstract | This paper proposes applying multi-task learning to train deep neural networks (DNNs) for calibrating the PLDA scores of speaker verification systems under noisy environments. To facilitate the DNNs to learn the main task (calibration), several auxiliary tasks were introduced, including the prediction of SNR and duration from i-vectors and classifying whether an i-vector pair belongs to the same speaker or not. The possibility of replacing the PLDA model by a DNN during the scoring stage is also explored. Evaluations on noise contaminated speech suggest that the auxiliary tasks are important for the DNNs to learn the main calibration task and that the uncalibrated PLDA scores are an essential input to the DNNs. Without this input, the DNNs can only predict the score shifts accurately, suggesting that the PLDA model is indispensable. | en_US |
dcterms.accessRights | open access | en_US |
dcterms.bibliographicCitation | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, Stockholm, Sweden, 20-24 August 2017, p. 1562-1566 | en_US |
dcterms.issued | 2017 | - |
dc.identifier.scopus | 2-s2.0-85039167490 | - |
dc.relation.conference | International Speech Communication Association [Interspeech] | en_US |
dc.description.validate | 202405 bcch | en_US |
dc.description.oa | Version of Record | en_US |
dc.identifier.FolderNumber | EIE-0773 | - |
dc.description.fundingSource | RGC | en_US |
dc.description.pubStatus | Published | en_US |
dc.identifier.OPUS | 6912605 | - |
dc.description.oaCategory | VoR allowed | en_US |
Appears in Collections: | Conference Paper |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
tan17_interspeech.pdf | | 1.25 MB | Adobe PDF | View/Open |
Page views: 83 (last week: 4; last month: 4), as of Sep 21, 2025
Downloads: 29, as of Sep 21, 2025
Scopus citations: 3, as of Sep 26, 2025
Web of Science citations: 2, as of Sep 25, 2025
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.