Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/111705
DC Field | Value | Language
dc.contributor | Department of Chinese and Bilingual Studies | -
dc.creator | Ng, SI | -
dc.creator | Ng, CWY | -
dc.creator | Lee, T | -
dc.date.accessioned | 2025-03-13T02:22:08Z | -
dc.date.available | 2025-03-13T02:22:08Z | -
dc.identifier.uri | http://hdl.handle.net/10397/111705 | -
dc.description | 24th Annual Conference of the International Speech Communication Association, INTERSPEECH 2023, Dublin, Ireland, August 20-24, 2023 | en_US
dc.language.iso | en | en_US
dc.publisher | International Speech Communication Association | en_US
dc.rights | Copyright © 2023 ISCA | en_US
dc.rights | The following publication Ng, S.-I., Ng, C.W.-Y., Lee, T. (2023) A Study on Using Duration and Formant Features in Automatic Detection of Speech Sound Disorder in Children. Proc. Interspeech 2023, 4643-4647 is available at https://doi.org/10.21437/Interspeech.2023-937. | en_US
dc.title | A study on using duration and formant features in automatic detection of speech sound disorder in children | en_US
dc.type | Conference Paper | en_US
dc.identifier.spage | 4643 | -
dc.identifier.epage | 4647 | -
dc.identifier.doi | 10.21437/Interspeech.2023-937 | -
dcterms.abstract | Speech sound disorder (SSD) in children is manifested by persistent articulation and phonological errors on specific phonemes of a language. Automatic SSD detection can be done using features extracted from deep neural network models, but the interpretability of such learned features is a major concern. Motivated by clinical knowledge, this research investigates the use of duration and formant features for SSD detection. Acoustical analysis is performed to identify the acoustic features that differentiate the speech of typical and disordered children. On the task of SSD detection in Cantonese-speaking children, the duration features are found to outperform the formant features and to surpass previous methods that use a paralinguistic feature set and speaker embeddings. Specifically, the duration features achieve a mean unweighted average recall of 71.0%. The results enhance the understanding of SSD and motivate further use of temporal information of child speech in SSD detection. | -
dcterms.accessRights | open access | en_US
dcterms.bibliographicCitation | Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH 2023, p. 4643-4647 | -
dcterms.issued | 2023 | -
dc.identifier.scopus | 2-s2.0-85171583000 | -
dc.relation.conference | Conference of the International Speech Communication Association [INTERSPEECH] | -
dc.description.validate | 202503 bcch | -
dc.description.oa | Version of Record | en_US
dc.identifier.FolderNumber | OA_Others | en_US
dc.description.fundingSource | RGC | en_US
dc.description.fundingSource | Others | en_US
dc.description.fundingText | Research Committee of the Chinese University of Hong Kong; Hear Talk Foundation | en_US
dc.description.pubStatus | Published | en_US
dc.description.oaCategory | VoR allowed | en_US
Appears in Collections:Conference Paper
Files in This Item:
File | Description | Size | Format
ng23_interspeech.pdf | | 606.95 kB | Adobe PDF
Open Access Information
Status: open access
File Version: Version of Record
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.