An investigation of few-shot learning in spoken term classification

Chen, Y; Ko, T; Shang, L; Chen, X; Jiang, X; Li, Q

doi:10.21437/Interspeech.2020-2568

Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/105514

DC Field	Value	Language
dc.contributor	Department of Computing	-
dc.creator	Chen, Y	-
dc.creator	Ko, T	-
dc.creator	Shang, L	-
dc.creator	Chen, X	-
dc.creator	Jiang, X	-
dc.creator	Li, Q	-
dc.date.accessioned	2024-04-15T07:34:48Z	-
dc.date.available	2024-04-15T07:34:48Z	-
dc.identifier.uri	http://hdl.handle.net/10397/105514	-
dc.language.iso	en	en_US
dc.publisher	International Speech Communication Association	en_US
dc.rights	Copyright © 2020 ISCA	en_US
dc.rights	The following publication Chen, Y., Ko, T., Shang, L., Chen, X., Jiang, X., Li, Q. (2020) An Investigation of Few-Shot Learning in Spoken Term Classification. Proc. Interspeech 2020, 2582-2586, doi: 10.21437/Interspeech.2020-2568 is available at https://www.isca-speech.org/archive/interspeech_2020/chen20j_interspeech.html.	en_US
dc.subject	Convolutional neural network	en_US
dc.subject	Few-shot classification	en_US
dc.subject	Meta learning	en_US
dc.subject	Spoken term classification	en_US
dc.title	An investigation of few-shot learning in spoken term classification	en_US
dc.type	Conference Paper	en_US
dc.identifier.spage	2582	-
dc.identifier.epage	2586	-
dc.identifier.doi	10.21437/Interspeech.2020-2568	-
dcterms.abstract	In this paper, we investigate the feasibility of applying few-shot learning algorithms to a speech task. We formulate a user-defined scenario of spoken term classification as a few-shot learning problem. In most few-shot learning studies, it is assumed that all N classes are new in an N-way problem. We suggest that this assumption can be relaxed and define an (N+M)-way problem, where N and M are the numbers of new classes and fixed classes, respectively. We propose a modification to the Model-Agnostic Meta-Learning (MAML) algorithm to solve the problem. Experiments on the Google Speech Commands dataset show that our approach outperforms the conventional supervised learning approach and the original MAML.	-
dcterms.accessRights	open access	en_US
dcterms.bibliographicCitation	Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2020, 25-29 October 2020, Shanghai, China, p. 2582-2586	-
dcterms.issued	2020	-
dc.identifier.scopus	2-s2.0-85098174751	-
dc.relation.conference	Conference of the International Speech Communication Association [INTERSPEECH]	-
dc.description.validate	202402 bcch	-
dc.description.oa	Version of Record	en_US
dc.identifier.FolderNumber	COMP-0206	en_US
dc.description.fundingSource	Self-funded	en_US
dc.description.pubStatus	Published	en_US
dc.identifier.OPUS	49985484	en_US
dc.description.oaCategory	VoR allowed	en_US
Appears in Collections:	Conference Paper

Files in This Item:

File	Description	Size	Format
chen20j_interspeech.pdf		396.93 kB	Adobe PDF

Open Access Information

Status	open access
File Version	Version of Record

Usage and Citation Statistics

Metric	Count	As of
Page views	139 (last week: 6)	Nov 9, 2025
Downloads	56	Nov 9, 2025
Scopus citations	15	Dec 19, 2025
Web of Science citations	8	Dec 18, 2025