Please use this identifier to cite or link to this item:
http://hdl.handle.net/10397/118083
| DC Field | Value | Language |
|---|---|---|
| dc.contributor | School of Nursing | - |
| dc.creator | Zhou, J | - |
| dc.creator | Chen, K | - |
| dc.creator | Wei, M | - |
| dc.creator | Zhang, XP | - |
| dc.creator | Dou, Q | - |
| dc.creator | Qin, J | - |
| dc.date.accessioned | 2026-03-13T02:44:05Z | - |
| dc.date.available | 2026-03-13T02:44:05Z | - |
| dc.identifier.issn | 1051-8215 | - |
| dc.identifier.uri | http://hdl.handle.net/10397/118083 | - |
| dc.language.iso | en | en_US |
| dc.publisher | Institute of Electrical and Electronics Engineers | en_US |
| dc.rights | © 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. | en_US |
| dc.rights | The following publication J. Zhou, K. Chen, M. Wei, X. -P. Zhang, Q. Dou and J. Qin, 'Canonical Shape Reconstruction With SE(3) Equivariance Learning for Weakly-Supervised Object Pose Estimation,' in IEEE Transactions on Circuits and Systems for Video Technology, vol. 35, no. 7, pp. 6895-6909, July 2025 is available at https://doi.org/10.1109/TCSVT.2025.3542089. | en_US |
| dc.subject | 6D object pose estimation | en_US |
| dc.subject | SE(3) equivariance | en_US |
| dc.subject | Shape completion in arbitrary poses | en_US |
| dc.subject | Weakly-supervised training | en_US |
| dc.title | Canonical shape reconstruction with SE(3) equivariance learning for weakly-supervised object pose estimation | en_US |
| dc.type | Journal/Magazine Article | en_US |
| dc.identifier.spage | 6895 | - |
| dc.identifier.epage | 6909 | - |
| dc.identifier.volume | 35 | - |
| dc.identifier.issue | 7 | - |
| dc.identifier.doi | 10.1109/TCSVT.2025.3542089 | - |
| dcterms.abstract | 6D object pose estimation from a single RGB-D image is a fundamental problem in computer vision and robot manipulation. Despite recent advancements, existing methods still suffer from several limitations. First, the object shape representation extracted from the depth map is often less expressive, because the object point cloud parsed from the depth map is highly incomplete due to object self-occlusion and noisy due to sensor artifacts. This shape representation issue intensifies further when sufficient labeled data for model training is lacking, which unfortunately is another typical problem for object pose estimation given the heavy annotation cost of real-world pose labeling. In this study, we propose to tackle the above issues in a unified way. First, we enhance the object shape representation from the partial point cloud with a novel canonical shape reconstruction module, in which an implicit canonical frame is established by incorporating SE(3) equivariance; this achieves implicit feature alignment of the partial point-cloud inputs and leads to robust shape recovery. Second, based on the enhanced object representation, we further utilize the de-canonicalized, pose-dependent completed object shape as the training signal, and develop a novel weakly-supervised learning framework that leverages both labeled synthetic data and unlabeled real data to train the pose estimation model in a label-efficient way. Extensive experiments on three widely used benchmarks demonstrate the effectiveness and superiority of our framework over state-of-the-art methods. | - |
| dcterms.accessRights | open access | en_US |
| dcterms.bibliographicCitation | IEEE transactions on circuits and systems for video technology, July 2025, v. 35, no. 7, p. 6895-6909 | - |
| dcterms.isPartOf | IEEE transactions on circuits and systems for video technology | - |
| dcterms.issued | 2025-07 | - |
| dc.identifier.scopus | 2-s2.0-85218266830 | - |
| dc.identifier.eissn | 1558-2205 | - |
| dc.description.validate | 202603 bcjz | - |
| dc.description.oa | Accepted Manuscript | en_US |
| dc.identifier.SubFormID | G001229/2025-12 | en_US |
| dc.description.fundingSource | RGC | en_US |
| dc.description.fundingSource | Others | en_US |
| dc.description.fundingText | This work was supported in part by the General Research Fund of Hong Kong Research Grants Council under Grant 15218521, in part by the Theme-Based Research Scheme of Hong Kong Research Grants Council under Grant T45-401/22-N, in part by the Research Grants Council of Hong Kong under Grant 24209223, in part by Hong Kong Innovation and Technology Fund under Grant ITS/223/22, in part by the National Natural Science Foundation of China under Grant T2322012 and Grant 62172218, in part by Shenzhen Ubiquitous Data Enabling Key Laboratory under Grant ZDSYS20220527171406015, and in part by Tsinghua Shenzhen International Graduate School-Shenzhen Pengrui Endowed Professorship Scheme of Shenzhen Pengrui Foundation. | en_US |
| dc.description.pubStatus | Published | en_US |
| dc.description.oaCategory | Green (AAM) | en_US |
| Appears in Collections: | Journal/Magazine Article | |
Files in This Item:
| File | Description | Size | Format | |
|---|---|---|---|---|
| Zhou_Canonical_Shape_Reconstruction.pdf | Pre-Published version | 10.47 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
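The abstract above hinges on SE(3) equivariance: a feature map that commutes with rigid-body rotations and cancels translations, so partial point clouds observed in arbitrary poses align in a shared canonical frame. The paper's learned module is far richer, but the underlying property can be illustrated with a minimal, generic sketch (not the authors' implementation): centroid subtraction is translation-invariant and rotation-equivariant, which we verify numerically.

```python
import numpy as np

def center(points):
    """Toy canonicalization feature: subtract the centroid.
    Translation-invariant and rotation-equivariant under SE(3)."""
    return points - points.mean(axis=0)

def random_rotation(rng):
    """Sample a random proper rotation via QR decomposition."""
    q, r = np.linalg.qr(rng.standard_normal((3, 3)))
    q *= np.sign(np.diag(r))   # fix column signs for a canonical Q
    if np.linalg.det(q) < 0:   # ensure det = +1 (rotation, not reflection)
        q[:, 0] *= -1
    return q

rng = np.random.default_rng(0)
pts = rng.standard_normal((100, 3))   # stand-in for a partial point cloud
R = random_rotation(rng)
t = rng.standard_normal(3)

transformed = pts @ R.T + t           # apply an SE(3) action (R, t)
# Equivariance check: centering commutes with R and cancels t, i.e.
# center(P R^T + t) == center(P) R^T, so both inputs map to the same
# canonical shape up to a known rotation.
assert np.allclose(center(transformed), center(pts) @ R.T)
```

This sketch only recovers equivariance for a trivial feature; the paper's contribution is obtaining the same property for expressive learned features, enabling shape completion of inputs in arbitrary poses.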