High-level speaker verification via articulatory-feature based sequence kernels and SVM

Zhang, SX; Mak, MW

doi:10.21437/interspeech.2008-404

Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/114613

DC Field	Value	Language
dc.contributor	Department of Electrical and Electronic Engineering	-
dc.creator	Zhang, SX	-
dc.creator	Mak, MW	-
dc.date.accessioned	2025-08-18T03:02:15Z	-
dc.date.available	2025-08-18T03:02:15Z	-
dc.identifier.uri	http://hdl.handle.net/10397/114613	-
dc.description	Interspeech 2008, Brisbane, Australia, 22-26 September 2008	en_US
dc.language.iso	en	en_US
dc.publisher	International Speech Communication Association	en_US
dc.rights	Copyright © 2008 ISCA	en_US
dc.rights	The following publication Zhang, S.-X., Mak, M.-W. (2008) High-level speaker verification via articulatory-feature based sequence kernels and SVM. Proc. Interspeech 2008, 1393-1396 is available at https://doi.org/10.21437/Interspeech.2008-404.	en_US
dc.title	High-level speaker verification via articulatory-feature based sequence kernels and SVM	en_US
dc.type	Conference Paper	en_US
dc.identifier.spage	1393	-
dc.identifier.epage	1396	-
dc.identifier.doi	10.21437/interspeech.2008-404	-
dcterms.abstract	Articulatory-feature based pronunciation models (AFCPMs) are capable of capturing the pronunciation variations among different speakers and are good for high-level speaker recognition. However, the likelihood-ratio scoring method of AFCPMs is based on a decision boundary created by training the target speaker model and universal background model (UBM) separately. Therefore, the method does not fully utilize the discriminative information available in the training data. To fully harness the discriminative information, this paper proposes training a support vector machine (SVM) for computing the verification scores. More precisely, the models of target speakers, individual background speakers, and claimants are converted to AF-supervectors, which form the inputs to an AF-based kernel of the SVM for computing verification scores. Results show that the proposed AF-kernel scoring is complementary to likelihood-ratio scoring, leading to better performance when the two scoring methods are combined. Further performance enhancement was also observed when the AF scores were combined with acoustic scores derived from a GMM-UBM system.	-
dcterms.accessRights	open access	en_US
dcterms.bibliographicCitation	Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2008, p. 1393-1396	-
dcterms.issued	2008	-
dc.identifier.scopus	2-s2.0-84867219243	-
dc.description.validate	202508 bcch	-
dc.description.oa	Version of Record	en_US
dc.identifier.FolderNumber	OA_Others	en_US
dc.description.fundingSource	RGC	en_US
dc.description.fundingSource	Others	en_US
dc.description.fundingText	HKPolyU Project No. A-PA6F	en_US
dc.description.pubStatus	Published	en_US
dc.description.oaCategory	VoR allowed	en_US
Appears in Collections:	Conference Paper

Files in This Item:

File	Description	Size	Format
zhang08d_interspeech.pdf		518.27 kB	Adobe PDF	View/Open

Open Access Information

Status	open access
File Version	Version of Record


Scopus citations: 5 (as of Apr 3, 2026)
