Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/43678
DC Field | Value | Language
dc.contributor | Department of Electronic and Information Engineering | en_US
dc.creator | Pang, X | en_US
dc.creator | Mak, MW | en_US
dc.date.accessioned | 2016-06-07T06:22:55Z | -
dc.date.available | 2016-06-07T06:22:55Z | -
dc.identifier.issn | 1381-2416 | en_US
dc.identifier.uri | http://hdl.handle.net/10397/43678 | -
dc.language.iso | en | en_US
dc.publisher | Springer | en_US
dc.rights | © Springer Science+Business Media New York 2015 | en_US
dc.rights | This version of the article has been accepted for publication, after peer review (when applicable) and is subject to Springer Nature’s AM terms of use (https://www.springernature.com/gp/open-research/policies/accepted-manuscript-terms), but is not the Version of Record and does not reflect post-acceptance improvements, or any corrections. The Version of Record is available online at: http://dx.doi.org/10.1007/s10772-015-9310-8 | en_US
dc.subject | Fusion | en_US
dc.subject | I-vectors | en_US
dc.subject | NIST 2012 SRE | en_US
dc.subject | Noise robustness | en_US
dc.subject | Probabilistic LDA | en_US
dc.subject | Speaker verification | en_US
dc.title | Noise robust speaker verification via the fusion of SNR-independent and SNR-dependent PLDA | en_US
dc.type | Journal/Magazine Article | en_US
dc.identifier.spage | 633 | en_US
dc.identifier.epage | 648 | en_US
dc.identifier.volume | 18 | en_US
dc.identifier.issue | 4 | en_US
dc.identifier.doi | 10.1007/s10772-015-9310-8 | en_US
dcterms.abstract | While i-vectors with probabilistic linear discriminant analysis (PLDA) can achieve state-of-the-art performance in speaker verification, the mismatch caused by acoustic noise remains a key factor affecting system performance. In this paper, a fusion system that combines a multi-condition signal-to-noise ratio (SNR)-independent PLDA model and a mixture of SNR-dependent PLDA models is proposed to make speaker verification systems more noise robust. First, the whole range of SNR in which a verification system is expected to operate is divided into several narrow ranges. Then, a set of SNR-dependent PLDA models, one for each narrow SNR range, is trained. During verification, the SNR of the test utterance is used to determine which of the SNR-dependent PLDA models is used for scoring. To further enhance performance, the SNR-dependent and SNR-independent models are fused using linear and logistic regression fusion. The performance of the fusion system and the SNR-dependent system is evaluated on the NIST 2012 speaker recognition evaluation for both noisy and clean conditions. Results show that a mixture of SNR-dependent PLDA models performs better in both clean and noisy conditions. It was also found that the fusion system is more robust than the conventional i-vector/PLDA systems under noisy conditions. | en_US
dcterms.accessRights | open access | en_US
dcterms.bibliographicCitation | International journal of speech technology, Dec. 2015, v. 18, no. 4, p. 633-648 | en_US
dcterms.isPartOf | International journal of speech technology | en_US
dcterms.issued | 2015-12 | -
dc.identifier.scopus | 2-s2.0-84947485041 | -
dc.identifier.rosgroupid | 2015002456 | -
dc.description.ros | 2015-2016 > Academic research: refereed > Publication in refereed journal | en_US
dc.description.oa | Accepted Manuscript | en_US
dc.identifier.FolderNumber | RGC-B3-0969 | -
dc.description.fundingSource | RGC | en_US
dc.description.pubStatus | Published | en_US
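
Illustrative sketch (not taken from the article): the abstract describes choosing an SNR-dependent PLDA model from the SNR of the test utterance and then fusing its score with the score of the SNR-independent model. The Python below shows only that control flow. The SNR boundaries, the function names, and the fusion weight and bias are hypothetical placeholders, and the scorers are stand-in callables rather than real PLDA models. In a logistic regression fusion the weights would be fitted on development scores instead of being fixed by hand.

```python
from bisect import bisect_right
from typing import Callable, Sequence

# Hypothetical SNR boundaries in dB (the abstract does not state the real ones).
# Two edges give three narrow ranges: below 10 dB, 10 to 20 dB, 20 dB and above.
SNR_EDGES_DB = [10.0, 20.0]


def select_model_index(snr_db: float, edges: Sequence[float] = SNR_EDGES_DB) -> int:
    """Return the index of the SNR range (and thus the model) covering snr_db."""
    return bisect_right(edges, snr_db)


def fuse_linear(score_indep: float, score_dep: float,
                weight: float = 0.5, bias: float = 0.0) -> float:
    """Linear fusion of two verification scores.

    weight and bias are placeholders; in practice they would be estimated,
    for example by logistic regression on held-out trials.
    """
    return weight * score_indep + (1.0 - weight) * score_dep + bias


def verify_score(snr_db: float,
                 indep_scorer: Callable[[], float],
                 dep_scorers: Sequence[Callable[[], float]],
                 weight: float = 0.5) -> float:
    """Score one trial: pick the SNR-dependent scorer by SNR, then fuse."""
    dep_scorer = dep_scorers[select_model_index(snr_db)]
    return fuse_linear(indep_scorer(), dep_scorer(), weight)


if __name__ == "__main__":
    # Stand-in scorers returning fixed values, one per SNR range.
    dep = [lambda: 1.0, lambda: 2.0, lambda: 3.0]
    print(verify_score(15.0, lambda: 0.0, dep))
```

With these placeholder values, a test utterance at 15 dB falls in the middle range, so the second SNR-dependent scorer is used before fusion.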
Appears in Collections: Journal/Magazine Article
Files in This Item:
File | Description | Size | Format
Noise_Robust_Speaker.pdf | Pre-Published version | 1.44 MB | Adobe PDF
Open Access Information
Status: open access
File Version: Final Accepted Manuscript
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.