Please use this identifier to cite or link to this item:
http://hdl.handle.net/10397/115534
| DC Field | Value | Language |
|---|---|---|
| dc.contributor | Department of Electrical and Electronic Engineering | en_US |
| dc.creator | Zhang, Y | en_US |
| dc.creator | Wang, Y | en_US |
| dc.creator | Cui, Y | en_US |
| dc.creator | Chau, LP | en_US |
| dc.date.accessioned | 2025-10-06T03:10:43Z | - |
| dc.date.available | 2025-10-06T03:10:43Z | - |
| dc.identifier.issn | 1520-9210 | en_US |
| dc.identifier.uri | http://hdl.handle.net/10397/115534 | - |
| dc.language.iso | en | en_US |
| dc.publisher | Institute of Electrical and Electronics Engineers | en_US |
| dc.rights | © 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. | en_US |
| dc.rights | The following publication Y. Zhang, Y. Wang, Y. Cui and L.-P. Chau, '3DGeoDet: General-Purpose Geometry-Aware Image-Based 3D Object Detection,' in IEEE Transactions on Multimedia, vol. 27, pp. 6235-6247, 2025 is available at https://doi.org/10.1109/TMM.2025.3581780. | en_US |
| dc.subject | 3D geometry | en_US |
| dc.subject | Monocular 3D object detection | en_US |
| dc.subject | Multi-view 3D object detection | en_US |
| dc.subject | Voxel occupancy | en_US |
| dc.title | 3DGeoDet : general-purpose geometry-aware image-based 3D object detection | en_US |
| dc.type | Journal/Magazine Article | en_US |
| dc.identifier.spage | 6235 | en_US |
| dc.identifier.epage | 6247 | en_US |
| dc.identifier.volume | 27 | en_US |
| dc.identifier.doi | 10.1109/TMM.2025.3581780 | en_US |
| dcterms.abstract | This paper proposes 3DGeoDet, a novel geometry-aware 3D object detection approach that effectively handles single- and multi-view RGB images in indoor and outdoor environments, showcasing its general-purpose applicability. The key challenge for image-based 3D object detection tasks is the lack of 3D geometric cues, which leads to ambiguity in establishing correspondences between images and 3D representations. To tackle this problem, 3DGeoDet generates efficient 3D geometric representations in both explicit and implicit manners based on predicted depth information. Specifically, we utilize the predicted depth to learn voxel occupancy and optimize the voxelized 3D feature volume explicitly through the proposed voxel occupancy attention. To further enhance 3D awareness, the feature volume is integrated with an implicit 3D representation, the truncated signed distance function (TSDF). Without requiring supervision from 3D signals, we significantly improve the model’s comprehension of 3D geometry by leveraging intermediate 3D representations and achieve end-to-end training. Our approach surpasses the performance of state-of-the-art image-based methods on both single- and multi-view benchmark datasets across diverse environments, achieving a 9.3 mAP@0.5 improvement on the SUN RGB-D dataset, a 3.3 mAP@0.5 improvement on the ScanNetV2 dataset, and a 0.19 AP3D@0.7 improvement on the KITTI dataset. | en_US |
| dcterms.accessRights | open access | en_US |
| dcterms.bibliographicCitation | IEEE transactions on multimedia, 2025, v. 27, p. 6235-6247 | en_US |
| dcterms.isPartOf | IEEE transactions on multimedia | en_US |
| dcterms.issued | 2025 | - |
| dc.identifier.scopus | 2-s2.0-105008802041 | - |
| dc.identifier.eissn | 1941-0077 | en_US |
| dc.description.validate | 202510 bcch | en_US |
| dc.description.oa | Accepted Manuscript | en_US |
| dc.identifier.SubFormID | G000212/2025-07 | - |
| dc.description.fundingSource | Others | en_US |
| dc.description.fundingText | This work was conducted at the JC STEM Lab of Machine Learning and Computer Vision funded by The Hong Kong Jockey Club Charities Trust. | en_US |
| dc.description.pubStatus | Published | en_US |
| dc.description.oaCategory | Green (AAM) | en_US |
| dc.relation.rdata | https://cindy0725.github.io/3DGeoDet/ | en_US |
| Appears in Collections: | Journal/Magazine Article | |
Files in This Item:
| File | Description | Size | Format |
|---|---|---|---|
| Zhang_3DGeoDet_General_Purpose.pdf | Pre-Published version | 3.52 MB | Adobe PDF |



