Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/115296
Title: SaccpaNet: a separable atrous convolution-based cascade pyramid attention network to estimate body landmarks using cross-modal knowledge transfer for under-blanket sleep posture classification
Authors: Tam, AYC 
Mao, Y 
Lai, DKH 
Chan, ACH 
Cheung, DSK 
Kearns, WD
Wong, DWC 
Cheung, JCW 
Issue Date: 2024
Source: IEEE journal of biomedical and health informatics, Date of Publication: 23 July 2024, Early Access, https://dx.doi.org/10.1109/JBHI.2024.3432195
Abstract: The accuracy of sleep posture assessment in standard polysomnography may be compromised by the unfamiliar sleep laboratory environment. In this work, we aimed to develop a depth camera-based sleep posture monitoring and classification system for home or community use and to tailor a deep learning model that accounts for blanket interference. Our model comprised a joint coordinate estimation network (JCE) and a sleep posture classification network (SPC). SaccpaNet (Separable Atrous Convolution-based Cascade Pyramid Attention Network) was built on a pyramidal structure of residual separable atrous convolution units to reduce computational cost and enlarge the receptive field. The Saccpa attention unit served as the core of both the JCE and the SPC, and different backbones for the SPC were also evaluated. The model was cross-modally pretrained on RGB images from the COCO whole-body dataset and then trained/tested on depth image data collected from 150 participants performing seven sleep postures under four blanket conditions. In addition, we applied a data augmentation strategy that used intra-class mix-up to synthesize blanket conditions and an overlaid flip-cut to synthesize partially covered blanket conditions, forming a robustness evaluation we refer to as the Post-hoc Data Augmentation Robustness Test (PhD-ART). Our model achieved an average precision of estimated joint coordinates (PCK@0.1) of 0.652 and demonstrated adequate robustness. The overall sleep posture classification accuracy (F1-score) was 0.885 and 0.940 for 7- and 6-class classification, respectively. Our system was resistant to blanket interference, with a spread difference of 2.5%.
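
Note: The abstract describes residual separable atrous convolution units used to reduce computational cost while enlarging the receptive field. Below is a minimal PyTorch sketch of what such a unit could look like; the class name, channel count, dilation rate, and residual layout are illustrative assumptions and do not reproduce the authors' implementation.

import torch
import torch.nn as nn

class SeparableAtrousConvUnit(nn.Module):
    """Illustrative residual unit combining depthwise-separable and atrous
    (dilated) convolution; names and hyperparameters are assumptions."""

    def __init__(self, channels: int, dilation: int = 2):
        super().__init__()
        # Depthwise convolution with dilation enlarges the receptive field
        # without adding parameters beyond a 3x3 kernel per channel.
        self.depthwise = nn.Conv2d(
            channels, channels, kernel_size=3,
            padding=dilation, dilation=dilation,
            groups=channels, bias=False,
        )
        # Pointwise convolution mixes channels; with the depthwise step it
        # keeps the computational cost of the unit low.
        self.pointwise = nn.Conv2d(channels, channels, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.depthwise(x)
        out = self.pointwise(out)
        out = self.bn(out)
        # Residual connection, as suggested by the phrase "residual separable
        # atrous convolution unit" in the abstract.
        return self.act(out + x)

if __name__ == "__main__":
    unit = SeparableAtrousConvUnit(channels=64, dilation=2)
    x = torch.randn(1, 64, 96, 96)   # e.g., a depth-image feature map
    print(unit(x).shape)             # torch.Size([1, 64, 96, 96])

Stacking such units in a pyramid of decreasing spatial resolution, as the abstract indicates, would yield the cascade pyramid structure that feeds the Saccpa attention unit.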
Keywords: Convolution; Accuracy; Feature extraction; Data models; Robustness; Cameras; Kernel; Computer vision; Deep learning; Human activity recognition; Image classification; Sleep
Publisher: Institute of Electrical and Electronics Engineers
Journal: IEEE journal of biomedical and health informatics 
ISSN: 2168-2194
EISSN: 2168-2208
DOI: 10.1109/JBHI.2024.3432195
Rights: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License (https://creativecommons.org/licenses/by-nc-nd/4.0/).
The following publication Tam, A. Y. C., Mao, Y. J., Lai, D. K. H., Chan, A. C. H., Cheung, D. S. K., Kearns, W., ... & Cheung, J. C. W. (2024). SaccpaNet: A Separable Atrous Convolution-based Cascade Pyramid Attention Network to Estimate Body Landmarks Using Cross-modal Knowledge Transfer for Under-blanket Sleep Posture Classification. IEEE journal of biomedical and health informatics, 1-12 is available at https://doi.org/10.1109/JBHI.2024.3432195.
Appears in Collections:Journal/Magazine Article

Files in This Item:
File: Tam_SaccpaNet_Separable_Atrous.pdf (1.51 MB, Adobe PDF)
Open Access Information
Status: open access
File Version: Version of Record

Scopus Citations: 2 (as of Nov 21, 2025)
