A cross-feature interaction network for 3D human pose estimation

Peng, J; Zhou, Y; Mok, PY

doi:10.1016/j.patrec.2025.01.016

Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/111360

Title:	A cross-feature interaction network for 3D human pose estimation
Authors:	Peng, J; Zhou, Y; Mok, PY
Issue Date:	Mar-2025
Source:	Pattern recognition letters, Mar. 2025, v. 189, p. 175-181
Abstract:	The task of estimating 3D human poses from single monocular images is challenging because, unlike video sequences, single images can hardly provide any temporal information for the prediction. Most existing methods attempt to predict 3D poses by modeling the spatial dependencies inherent in the anatomical structure of the human skeleton, yet these methods fail to capture the complex local and global relationships that exist among various joints. To solve this problem, we propose a novel Cross-Feature Interaction Network to effectively model spatial correlations between body joints. Specifically, we exploit graph convolutional networks (GCNs) to learn the local features between neighboring joints and the self-attention structure to learn the global features among all joints. We then design a cross-feature interaction (CFI) module to facilitate cross-feature communications among the three different features, namely the local features, global features, and initial 2D pose features, aggregating them to form enhanced spatial representations of human pose. Furthermore, a novel graph-enhanced module (GraMLP) with parallel GCN and multi-layer perceptron is introduced to inject the skeletal knowledge of the human body into the final representation of 3D pose. Extensive experiments on two datasets (Human3.6M (Ionescu et al., 2013) and MPI-INF-3DHP (Mehta et al., 2017)) show the superior performance of our method in comparison to existing state-of-the-art (SOTA) models. The code and data are shared at https://github.com/JihuaPeng/CFI-3DHPE
Keywords:	3D human pose estimation; Cross-attention; Graph convolutional network (GCN); Self-attention
Publisher:	Elsevier
Journal:	Pattern recognition letters
ISSN:	0167-8655
EISSN:	1872-7344
DOI:	10.1016/j.patrec.2025.01.016
Rights:	© 2025 The Authors. Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/). The following publication Peng, J., Zhou, Y., & Mok, P. Y. (2025). A cross-feature interaction network for 3D human pose estimation. Pattern Recognition Letters, 189, 175-181 is available at https://doi.org/10.1016/j.patrec.2025.01.016.
Appears in Collections:	Journal/Magazine Article
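
The pipeline outlined in the abstract (local features from a GCN, global features from self-attention, then cross-feature interaction with the initial 2D pose features) can be illustrated with a minimal sketch. This is not the authors' implementation: the 5-joint chain skeleton, the toy coordinates, the helper names, and the exact form of the fusion step are all assumptions made for illustration, and learned weights and nonlinearities are omitted so the example stays deterministic.

```python
import math

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    """Swap rows and columns of a matrix."""
    return [list(col) for col in zip(*a)]

def softmax_rows(a):
    """Row-wise softmax with max-subtraction for numerical stability."""
    out = []
    for row in a:
        m = max(row)
        exps = [math.exp(v - m) for v in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out

def normalized_adjacency(edges, num_joints):
    """Symmetric normalization D^-1/2 (A + I) D^-1/2 of the skeleton graph."""
    a = [[1.0 if i == j else 0.0 for j in range(num_joints)]
         for i in range(num_joints)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_joints)]
            for i in range(num_joints)]

def attention(queries, keys, values):
    """Scaled dot-product attention; returns (output, attention weights)."""
    scale = math.sqrt(len(queries[0]))
    scores = [[s / scale for s in row] for row in matmul(queries, transpose(keys))]
    weights = softmax_rows(scores)
    return matmul(weights, values), weights

# Hypothetical 5-joint chain (not the 17-joint Human3.6M skeleton).
EDGES = [(0, 1), (1, 2), (2, 3), (3, 4)]
# Toy 2D pose features, one (x, y) row per joint.
pose_2d = [[0.0, 0.0], [0.1, 0.5], [0.2, 1.0], [0.6, 1.1], [1.0, 1.2]]

# Local features: GCN-style aggregation over each joint and its neighbors.
a_hat = normalized_adjacency(EDGES, len(pose_2d))
local_feat = matmul(a_hat, pose_2d)

# Global features: self-attention over all joints.
global_feat, self_attn = attention(pose_2d, pose_2d, pose_2d)

# Cross-feature interaction (assumed form): the 2D pose features query the
# local and global features, and both results are added to the input.
from_local, _ = attention(pose_2d, local_feat, local_feat)
from_global, _ = attention(pose_2d, global_feat, global_feat)
fused = [[p + l + g for p, l, g in zip(p_row, l_row, g_row)]
         for p_row, l_row, g_row in zip(pose_2d, from_local, from_global)]
```

The sketch shows why the two branches are complementary: the normalized adjacency restricts each joint to its skeletal neighbors, while the attention weights connect every joint to every other joint. The repository linked in the abstract is the place to look for the actual model.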

Files in This Item:

File	Description	Size	Format
1-s2.0-S0167865525000157-main.pdf		1.78 MB	Adobe PDF

Open Access Information

Status	open access
File Version	Version of Record
