Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/118083
Title: Canonical shape reconstruction with SE(3) equivariance learning for weakly-supervised object pose estimation
Authors: Zhou, J 
Chen, K
Wei, M
Zhang, XP
Dou, Q
Qin, J 
Issue Date: Jul-2025
Source: IEEE transactions on circuits and systems for video technology, July 2025, v. 35, no. 7, p. 6895-6909
Abstract: 6D object pose estimation from a single RGB-D image is a fundamental problem in computer vision and robot manipulation. Despite recent advancements, existing methods still suffer from several limitations. First, the object shape representation extracted from the depth map is often insufficiently expressive, because the object point cloud parsed from the depth map is highly incomplete due to object self-occlusion and noisy due to sensor artifacts. This shape representation issue is further intensified when labeled data for model training is insufficient, which unfortunately is another typical problem for object pose estimation given the heavy annotation cost of real-world pose labeling. In this study, we propose to tackle the above issues in a unified way. First, we enhance the object shape representation from the partial point cloud with a novel canonical shape reconstruction module, in which an implicit canonical frame is established by incorporating SE(3) equivariance, achieving implicit feature alignment of the partial point cloud inputs and leading to robust shape recovery. Second, based on the enhanced object representation, we further utilize the de-canonicalized, pose-dependent completed object shape as the training signal, and develop a novel weakly-supervised learning framework that leverages both labeled synthetic data and unlabeled real data to train the pose estimation model in a label-efficient way. Extensive experiments on three widely used benchmarks demonstrate the effectiveness and superiority of our framework over state-of-the-art methods.
Keywords: 6D object pose estimation
SE(3) equivariance
Shape completion in arbitrary poses
Weakly-supervised training
Publisher: Institute of Electrical and Electronics Engineers
Journal: IEEE transactions on circuits and systems for video technology 
ISSN: 1051-8215
EISSN: 1558-2205
DOI: 10.1109/TCSVT.2025.3542089
Rights: © 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
The following publication J. Zhou, K. Chen, M. Wei, X. -P. Zhang, Q. Dou and J. Qin, 'Canonical Shape Reconstruction With SE(3) Equivariance Learning for Weakly-Supervised Object Pose Estimation,' in IEEE Transactions on Circuits and Systems for Video Technology, vol. 35, no. 7, pp. 6895-6909, July 2025 is available at https://doi.org/10.1109/TCSVT.2025.3542089.
Appears in Collections: Journal/Magazine Article

Files in This Item:
File: Zhou_Canonical_Shape_Reconstruction.pdf
Description: Pre-Published version
Size: 10.47 MB
Format: Adobe PDF
Open Access Information
Status: open access
File Version: Final Accepted Manuscript

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.