Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/102666
DC Field | Value | Language
dc.contributor | Department of Rehabilitation Sciences | en_US
dc.contributor | Department of Computing | en_US
dc.contributor | School of Optometry | en_US
dc.creator | Yang, C | en_US
dc.creator | Wang, K | en_US
dc.creator | Chen, PQ | en_US
dc.creator | Cheung, MKM | en_US
dc.creator | Zhang, Y | en_US
dc.creator | Fu, EY | en_US
dc.creator | Ngai, G | en_US
dc.date.accessioned | 2023-11-06T05:52:15Z | -
dc.date.available | 2023-11-06T05:52:15Z | -
dc.identifier.isbn | 979-8-4007-0108-5 | en_US
dc.identifier.uri | http://hdl.handle.net/10397/102666 | -
dc.description | 31st ACM International Conference on Multimedia, Ottawa, ON, Canada, 29 October 2023 - 3 November 2023 | en_US
dc.language.iso | en | en_US
dc.publisher | Association for Computing Machinery | en_US
dc.rights | © 2023 Copyright held by the owner/author(s). Publication rights licensed to ACM. This is the author's version of the work. It is posted here for your personal use. Not for redistribution. The definitive Version of Record was published in MM '23: Proceedings of the 31st ACM International Conference on Multimedia, https://doi.org/10.1145/3581783.3612873. | en_US
dc.subject | Engagement | en_US
dc.subject | Machine learning | en_US
dc.subject | Neural networks | en_US
dc.title | MultiMediate 2023: engagement level detection using audio and video features | en_US
dc.type | Conference Paper | en_US
dc.identifier.spage | 9601 | en_US
dc.identifier.epage | 9605 | en_US
dc.identifier.doi | 10.1145/3581783.3612873 | en_US
dcterms.abstract | Real-time engagement estimation holds significant potential across various research areas, particularly in human-computer interaction. It empowers artificial agents to dynamically adjust their responses based on user engagement levels, fostering more intuitive and immersive interactions. Despite the strides in automating real-time engagement estimation, the task remains challenging in real-world settings, especially when handling multi-modal human social signals. Capitalizing on human body and audio signals, this paper explores appropriate feature representations of different modalities and effective modelling of dual conversations, resulting in a novel and efficient multi-modal engagement detection model. We thoroughly evaluated our method in the MultiMediate'23 grand challenge. It performs consistently, with a notable improvement over the baseline model: while the baseline achieves a concordance correlation coefficient (CCC) of 0.59, our approach yields a CCC of 0.70, suggesting its promising efficacy in real-life engagement detection. | en_US
dcterms.accessRights | open access | en_US
dcterms.bibliographicCitation | In MM '23: Proceedings of the 31st ACM International Conference on Multimedia, p. 9601-9605. New York, NY: Association for Computing Machinery, 2023 | en_US
dcterms.issued | 2023 | -
dc.relation.ispartofbook | MM '23: Proceedings of the 31st ACM International Conference on Multimedia | en_US
dc.relation.conference | ACM International Conference on Multimedia [MM] | en_US
dc.description.validate | 202311 bcch | en_US
dc.description.oa | Accepted Manuscript | en_US
dc.identifier.FolderNumber | a2504 | -
dc.identifier.SubFormID | 47795 | -
dc.description.fundingSource | RGC | en_US
dc.description.pubStatus | Published | en_US
dc.description.oaCategory | Green (AAM) | en_US
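
Note on the evaluation metric: the abstract above reports results as a concordance correlation coefficient (CCC). For readers unfamiliar with the metric, the following is a minimal sketch of Lin's CCC as it is commonly computed for engagement-score regression. It is illustrative only, not the authors' implementation; the function name and toy values are assumptions.

```python
# Minimal sketch of Lin's concordance correlation coefficient (CCC),
# the metric cited in the abstract. Not the authors' code.
import numpy as np

def concordance_ccc(y_true, y_pred):
    """Lin's CCC between ground-truth and predicted engagement scores."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mean_t, mean_p = y_true.mean(), y_pred.mean()
    var_t, var_p = y_true.var(), y_pred.var()               # population variances
    cov = ((y_true - mean_t) * (y_pred - mean_p)).mean()    # population covariance
    return 2.0 * cov / (var_t + var_p + (mean_t - mean_p) ** 2)

if __name__ == "__main__":
    # Hypothetical example: predictions that are perfectly correlated with the
    # truth but shifted by a constant still score below 1.0, because CCC also
    # penalises mean and variance mismatch (unlike Pearson correlation).
    truth = np.array([0.1, 0.4, 0.6, 0.9])
    pred = truth + 0.1
    print(round(concordance_ccc(truth, pred), 3))
```

Because CCC requires both high correlation and agreement in mean and scale, it is a stricter criterion than Pearson correlation, which is why it is used to score continuous engagement predictions in the MultiMediate challenge.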
Appears in Collections: Conference Paper
Files in This Item:
File | Description | Size | Format
Yang_MultiMediate_Engagement_Level.pdf | Pre-Published version | 1.18 MB | Adobe PDF
Open Access Information
Status: open access
File Version: Final Accepted Manuscript