Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/115296
Title: SaccpaNet: a separable atrous convolution-based cascade pyramid attention network to estimate body landmarks using cross-modal knowledge transfer for under-blanket sleep posture classification
Authors: Tam, AYC 
Mao, Y 
Lai, DKH 
Chan, ACH 
Cheung, DSK 
Kearns, WD
Wong, DWC 
Cheung, JCW 
Issue Date: 2024
Source: IEEE journal of biomedical and health informatics, Date of Publication: 23 July 2024, Early Access, https://dx.doi.org/10.1109/JBHI.2024.3432195
Abstract: The accuracy of sleep posture assessment in standard polysomnography may be compromised by the unfamiliar sleep laboratory environment. In this work, we aimed to develop a depth camera-based sleep posture monitoring and classification system for home or community use and to tailor a deep learning model that accounts for blanket interference. Our model comprised a joint coordinate estimation network (JCE) and a sleep posture classification network (SPC). SaccpaNet (Separable Atrous Convolution-based Cascade Pyramid Attention Network) was built on a pyramidal structure of residual separable atrous convolution units to reduce computational cost and enlarge the receptive field. The Saccpa attention unit served as the core of both the JCE and the SPC, and different backbones for the SPC were also evaluated. The model was cross-modally pretrained on RGB images from the COCO whole-body dataset and then trained/tested on depth image data collected from 150 participants performing seven sleep postures under four blanket conditions. In addition, we applied a data augmentation strategy that used intra-class mix-up to synthesize blanket conditions and an overlaid flip-cut to synthesize partially covered blanket conditions, forming a robustness evaluation we refer to as the Post-hoc Data Augmentation Robustness Test (PhD-ART). Our model achieved an average precision of estimated joint coordinates (PCK@0.1) of 0.652 and demonstrated adequate robustness. The overall sleep posture classification accuracy (F1-score) was 0.885 and 0.940 for 7- and 6-class classification, respectively. Our system was resistant to blanket interference, with a spread difference of 2.5%.
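
Note: The abstract describes residual separable atrous convolution units used to reduce computational cost while enlarging the receptive field. Below is a minimal PyTorch sketch of what such a unit could look like; the class name, channel count, dilation rate, and residual layout are illustrative assumptions and do not reproduce the authors' implementation.

import torch
import torch.nn as nn

class SeparableAtrousConvUnit(nn.Module):
    """Illustrative residual unit combining depthwise-separable and atrous
    (dilated) convolution; names and hyperparameters are assumptions."""

    def __init__(self, channels: int, dilation: int = 2):
        super().__init__()
        # Depthwise convolution with dilation enlarges the receptive field
        # without adding parameters beyond a 3x3 kernel per channel.
        self.depthwise = nn.Conv2d(
            channels, channels, kernel_size=3,
            padding=dilation, dilation=dilation,
            groups=channels, bias=False,
        )
        # Pointwise convolution mixes channels; with the depthwise step it
        # keeps the computational cost of the unit low.
        self.pointwise = nn.Conv2d(channels, channels, kernel_size=1, bias=False)
        self.bn = nn.BatchNorm2d(channels)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out = self.depthwise(x)
        out = self.pointwise(out)
        out = self.bn(out)
        # Residual connection, as suggested by the phrase "residual separable
        # atrous convolution unit" in the abstract.
        return self.act(out + x)

if __name__ == "__main__":
    unit = SeparableAtrousConvUnit(channels=64, dilation=2)
    x = torch.randn(1, 64, 96, 96)   # e.g., a depth-image feature map
    print(unit(x).shape)             # torch.Size([1, 64, 96, 96])

Stacking such units in a pyramid of decreasing spatial resolution, as the abstract indicates, would yield the cascade pyramid structure that feeds the Saccpa attention unit.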
Keywords: Convolution; Accuracy; Feature extraction; Data models; Robustness; Cameras; Kernel; Computer vision; Deep learning; Human activity recognition; Image classification; Sleep
Publisher: Institute of Electrical and Electronics Engineers
Journal: IEEE journal of biomedical and health informatics 
ISSN: 2168-2194
EISSN: 2168-2208
DOI: 10.1109/JBHI.2024.3432195
Rights: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License (https://creativecommons.org/licenses/by-nc-nd/4.0/).
The following publication Tam, A. Y. C., Mao, Y. J., Lai, D. K. H., Chan, A. C. H., Cheung, D. S. K., Kearns, W., ... & Cheung, J. C. W. (2024). SaccpaNet: A Separable Atrous Convolution-based Cascade Pyramid Attention Network to Estimate Body Landmarks Using Cross-modal Knowledge Transfer for Under-blanket Sleep Posture Classification. IEEE journal of biomedical and health informatics, 1-12 is available at https://doi.org/10.1109/JBHI.2024.3432195.
Appears in Collections:Journal/Magazine Article

Files in This Item:
File: Tam_SaccpaNet_Separable_Atrous.pdf (1.51 MB, Adobe PDF)
Open Access Information
Status: open access
File Version: Version of Record

Scopus Citations: 2 (as of Nov 21, 2025)
