Please use this identifier to cite or link to this item:
http://hdl.handle.net/10397/116648
| DC Field | Value | Language |
|---|---|---|
| dc.contributor | Department of Computing | en_US |
| dc.creator | Hong, M | en_US |
| dc.creator | Zhang, CJ | en_US |
| dc.creator | Yang, L | en_US |
| dc.creator | Song, Y | en_US |
| dc.creator | Jiang, D | en_US |
| dc.date.accessioned | 2026-01-09T03:07:29Z | - |
| dc.date.available | 2026-01-09T03:07:29Z | - |
| dc.identifier.uri | http://hdl.handle.net/10397/116648 | - |
| dc.description | The 16th Asian Conference on Machine Learning, December 5-8, 2024, in Hanoi, Vietnam | en_US |
| dc.language.iso | en | en_US |
| dc.publisher | PMLR web site | en_US |
| dc.rights | © 2024 M. Hong, C.J. Zhang, L. Yang, Y. Song & D. Jiang. | en_US |
| dc.rights | Posted with permission of the author. | en_US |
| dc.rights | The following publication Hong, M., Zhang, C.J., Yang, L., Song, Y. & Jiang, D. (2025). InfantCryNet: A Data-driven Framework for Intelligent Analysis of Infant Cries. Proceedings of the 16th Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 260:845-857 is available at https://proceedings.mlr.press/v260/hong25a.html. | en_US |
| dc.subject | Convolutional neural networks | en_US |
| dc.subject | Infant cry classification | en_US |
| dc.subject | Model compression | en_US |
| dc.title | InfantCryNet : a data-driven framework for intelligent analysis of infant cries | en_US |
| dc.type | Conference Paper | en_US |
| dc.identifier.spage | 845 | en_US |
| dc.identifier.epage | 857 | en_US |
| dc.identifier.volume | 260 | en_US |
| dcterms.abstract | Understanding the meaning of infant cries is a significant challenge for young parents in caring for their newborns. The presence of background noise and the lack of labeled data present practical challenges in developing systems that can detect crying and analyze its underlying reasons. In this paper, we present a novel data-driven framework, “InfantCryNet,” for accomplishing these tasks. To address the issue of data scarcity, we employ pre-trained audio models to incorporate prior knowledge into our model. We propose the use of statistical pooling and multi-head attention pooling techniques to extract features more effectively. Additionally, knowledge distillation and model quantization are applied to enhance model efficiency and reduce the model size, better supporting industrial deployment in mobile devices. Experiments on real-life datasets demonstrate the superior performance of the proposed framework, outperforming state-of-the-art baselines by 4.4% in classification accuracy. The model compression effectively reduces the model size by 7% without compromising performance and by up to 28% with only an 8% decrease in accuracy, offering practical insights for model selection and system design. | en_US |
| dcterms.accessRights | open access | en_US |
| dcterms.bibliographicCitation | Proceedings of the 16th Asian Conference on Machine Learning, 2024, v. 260, p. 845-857 | en_US |
| dcterms.issued | 2024 | - |
| dc.relation.conference | Asian Conference on Machine Learning [ACML] | en_US |
| dc.description.validate | 202601 bcch | en_US |
| dc.description.oa | Version of Record | en_US |
| dc.identifier.FolderNumber | a4255a | - |
| dc.identifier.SubFormID | 52470 | - |
| dc.description.fundingSource | RGC | en_US |
| dc.description.pubStatus | Published | en_US |
| dc.description.oaCategory | Copyright retained by author | en_US |
| Appears in Collections: | Conference Paper | |
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| hong25a.pdf | | 2.99 MB | Adobe PDF | View/Open |


