Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/106863
DC Field | Value | Language
dc.contributor | Department of Electrical and Electronic Engineering | -
dc.creator | Xu, SS | -
dc.creator | Ke, X | -
dc.creator | Mak, MW | -
dc.creator | Wong, KH | -
dc.creator | Meng, H | -
dc.creator | Kwok, TCY | -
dc.creator | Gu, J | -
dc.creator | Zhang, J | -
dc.creator | Tao, W | -
dc.creator | Chang, C | -
dc.date.accessioned | 2024-06-06T06:06:03Z | -
dc.date.available | 2024-06-06T06:06:03Z | -
dc.identifier.issn | 1662-4548 | -
dc.identifier.uri | http://hdl.handle.net/10397/106863 | -
dc.language.iso | en | en_US
dc.publisher | Frontiers Research Foundation | en_US
dc.rights | © 2024 Xu, Ke, Mak, Wong, Meng, Kwok, Gu, Zhang, Tao and Chang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY) (http://creativecommons.org/licenses/by/4.0/). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms. | en_US
dc.rights | The following publication Xu SS, Ke X, Mak M-W, Wong KH, Meng H, Kwok TCY, Gu J, Zhang J, Tao W and Chang C (2024) Speaker-turn aware diarization for speech-based cognitive assessments. Front. Neurosci. 17:1351848 is available at https://doi.org/10.3389/fnins.2023.1351848. | en_US
dc.subject | Comprehensive scoring | en_US
dc.subject | Dementia detection | en_US
dc.subject | MOCA | en_US
dc.subject | Speaker diarization | en_US
dc.subject | Speaker embedding | en_US
dc.subject | Speaker-turn timestamps | en_US
dc.title | Speaker-turn aware diarization for speech-based cognitive assessments | en_US
dc.type | Journal/Magazine Article | en_US
dc.identifier.volume | 17 | -
dc.identifier.doi | 10.3389/fnins.2023.1351848 | -
dcterms.abstract | Introduction: Speaker diarization is an essential preprocessing step for diagnosing cognitive impairments from speech-based Montreal Cognitive Assessments (MoCA). | -
dcterms.abstract | Methods: This paper proposes three enhancements to the conventional speaker diarization methods for such assessments. The enhancements tackle the challenges of diarizing MoCA recordings on two fronts. First, multi-scale channel interdependence speaker embedding is used as the front-end speaker representation for overcoming the acoustic mismatch caused by far-field microphones. Specifically, a squeeze-and-excitation (SE) unit and channel-dependent attention are added to Res2Net blocks for multi-scale feature aggregation. Second, a sequence comparison approach with a holistic view of the whole conversation is applied to measure the similarity of short speech segments in the conversation, which results in a speaker-turn aware scoring matrix for the subsequent clustering step. Third, to further enhance the diarization performance, we propose incorporating a pairwise similarity measure so that the speaker-turn aware scoring matrix contains both local and global information across the segments. | -
dcterms.abstract | Results: Evaluations on an interactive MoCA dataset show that the proposed enhancements lead to a diarization system that outperforms the conventional x-vector/PLDA systems under language-, age-, and microphone-mismatch scenarios. | -
dcterms.abstract | Discussion: The results also show that the proposed enhancements can help hypothesize the speaker-turn timestamps, making the diarization method amenable to datasets without timestamp information. | -
dcterms.accessRights | open access | en_US
dcterms.bibliographicCitation | Frontiers in neuroscience, 2023, v. 17, 1351848 | -
dcterms.isPartOf | Frontiers in neuroscience | -
dcterms.issued | 2023 | -
dc.identifier.scopus | 2-s2.0-85183596407 | -
dc.identifier.eissn | 1662-453X | -
dc.identifier.artn | 1351848 | -
dc.description.validate | 202406 bcch | -
dc.description.oa | Version of Record | en_US
dc.identifier.FolderNumber | a2778 | en_US
dc.identifier.SubFormID | 48312 | en_US
dc.description.fundingSource | RGC | en_US
dc.description.pubStatus | Published | en_US
dc.description.oaCategory | CC | en_US
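
The abstract's Methods entry mentions adding a squeeze-and-excitation (SE) unit to Res2Net blocks. As a rough illustration of what a generic SE unit does, the sketch below rescales the channels of a feature map in NumPy. It is not the authors' implementation: the function name `squeeze_excitation`, the random weights, and the sizes (8 channels, 50 frames, reduction factor 4) are all assumptions chosen for the example, and the channel-dependent attention and Res2Net parts are not shown.

```python
import numpy as np

def squeeze_excitation(features, w1, b1, w2, b2):
    """Rescale the channels of a (channels, frames) feature map with an SE unit."""
    # Squeeze: average each channel over time to get one descriptor per channel.
    z = features.mean(axis=1)                           # shape (C,)
    # Excitation: bottleneck layer with ReLU, then sigmoid gates in (0, 1).
    hidden = np.maximum(0.0, w1 @ z + b1)               # shape (C // r,)
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden + b2)))   # shape (C,)
    # Scale: multiply every frame of a channel by that channel's gate.
    return features * gates[:, None], gates

# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
channels, frames, reduction = 8, 50, 4
x = rng.standard_normal((channels, frames))
w1 = rng.standard_normal((channels // reduction, channels)) * 0.1
b1 = np.zeros(channels // reduction)
w2 = rng.standard_normal((channels, channels // reduction)) * 0.1
b2 = np.zeros(channels)

y, gates = squeeze_excitation(x, w1, b1, w2, b2)
print(y.shape, gates.round(3))
```

In a trained network the two weight matrices are learned, so the gates emphasize informative channels; here they are random and only show the data flow.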
Appears in Collections: Journal/Magazine Article
Files in This Item:
File | Description | Size | Format
fnins-17-1351848.pdf | | 1.44 MB | Adobe PDF
Open Access Information
Status: open access
File Version: Version of Record
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.