Optimization of discriminative kernels in SVM speaker verification

Zhang, SX; Mak, MW

doi:10.21437/interspeech.2009-380

Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/114612

DC Field	Value	Language
dc.contributor	Department of Electrical and Electronic Engineering	-
dc.creator	Zhang, SX	-
dc.creator	Mak, MW	-
dc.date.accessioned	2025-08-18T03:02:14Z	-
dc.date.available	2025-08-18T03:02:14Z	-
dc.identifier.uri	http://hdl.handle.net/10397/114612	-
dc.description	Interspeech 2009, Brighton, United Kingdom, 6-10 September 2009	en_US
dc.language.iso	en	en_US
dc.publisher	International Speech Communication Association	en_US
dc.rights	Copyright © 2009 ISCA	en_US
dc.rights	The following publication Zhang, S.-X., Mak, M.-W. (2009) Optimization of discriminative kernels in SVM speaker verification. Proc. Interspeech 2009, 1275-1278 is available at https://doi.org/10.21437/Interspeech.2009-380.	en_US
dc.subject	High-level features	en_US
dc.subject	Optimal kernels	en_US
dc.subject	Sequence kernels	en_US
dc.subject	Speaker verification	en_US
dc.subject	SVM	en_US
dc.title	Optimization of discriminative kernels in SVM speaker verification	en_US
dc.type	Conference Paper	en_US
dc.identifier.spage	1275	-
dc.identifier.epage	1278	-
dc.identifier.doi	10.21437/interspeech.2009-380	-
dcterms.abstract	An important aspect of SVM-based speaker verification systems is the design of sequence kernels. These kernels should be able to map variable-length observation sequences to fixed-size supervectors that capture the dynamic characteristics of speech utterances and allow speakers to be easily distinguished. Most existing kernels in SVM speaker verification are obtained by assuming a specific form for the similarity function of supervectors. This paper relaxes this assumption to derive a new general kernel. The kernel function is general in that it is a linear combination of any kernels belonging to the reproducing kernel Hilbert space. The combination weights are obtained by optimizing the ability of a discriminant function to separate a target speaker from impostors using either regression analysis or SVM training. The idea was applied to both low- and high-level speaker verification. In both cases, results show that the proposed kernels outperform the state-of-the-art sequence kernels. Further performance enhancement was also observed when the high-level scores were combined with acoustic scores.	-
dcterms.accessRights	open access	en_US
dcterms.bibliographicCitation	Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH, 2009, p. 1275-1278	-
dcterms.issued	2009	-
dc.identifier.scopus	2-s2.0-70450175241	-
dc.description.validate	202508 bcch	-
dc.description.oa	Version of Record	en_US
dc.identifier.FolderNumber	OA_Others	en_US
dc.description.fundingSource	RGC	en_US
dc.description.fundingSource	Others	en_US
dc.description.fundingText	Center for Multimedia Signal Processing, The Hong Kong Polytechnic University (1-BB9W)	en_US
dc.description.pubStatus	Published	en_US
dc.description.oaCategory	VoR allowed	en_US
Appears in Collections:	Conference Paper
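
Illustrative note on the abstract: the central idea, a kernel formed as a weighted linear combination of base kernels evaluated on fixed-size supervectors, can be sketched in a few lines. This is a minimal sketch under stated assumptions and not the authors' implementation. The mean-pooling map `mean_supervector`, the two base kernels, and the hand-picked weights are all hypothetical stand-ins; the paper obtains the weights by regression analysis or SVM training, which is not reproduced here.

```python
import math


def linear_kernel(x, y):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an arbitrary illustrative value."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mean_supervector(frames):
    """Hypothetical stand-in mapping: average the frames of a
    variable-length sequence into one fixed-size vector."""
    dim = len(frames[0])
    return [sum(f[d] for f in frames) / len(frames) for d in range(dim)]


def combined_kernel(seq_a, seq_b, weights, base_kernels):
    """Weighted sum of base kernels on the two supervectors.
    Non-negative weights keep the sum a valid (PSD) kernel."""
    xa = mean_supervector(seq_a)
    xb = mean_supervector(seq_b)
    return sum(w * k(xa, xb) for w, k in zip(weights, base_kernels))


# Two toy "utterances" of different lengths (2 and 3 frames, 2-D features).
utt1 = [[1.0, 0.0], [3.0, 2.0]]
utt2 = [[2.0, 1.0], [2.0, 1.0], [2.0, 1.0]]

# Hand-picked weights (assumption); the paper learns them discriminatively.
weights = [0.25, 0.75]
base_kernels = [linear_kernel, rbf_kernel]

k12 = combined_kernel(utt1, utt2, weights, base_kernels)
k21 = combined_kernel(utt2, utt1, weights, base_kernels)
print(k12, k21)
```

Both utterances average to the same vector here, so the two base kernels contribute 5 and 1 respectively before weighting. Restricting the weights to be non-negative is one simple way to guarantee that the combination is itself a valid kernel.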

Files in This Item:

File	Description	Size	Format
zhang09c_interspeech.pdf		1.26 MB	Adobe PDF

Open Access Information

Status	open access
File Version	Version of Record

Scopus citations: 4 (as of Apr 3, 2026)