Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/114991
View/Download Full Text
Title: Enhancing human detection in occlusion-heavy disaster scenarios: a visibility-enhanced DINO (VE-DINO) model with reassembled occlusion dataset
Authors: Zhao, ZA
Wang, SD
Chen, MX
Mao, YJ
Chan, ACH
Lai, DKH
Wong, DWC
Cheung, JCW
Issue Date: Feb-2025
Source: Smart cities, Feb. 2025, v. 8, no. 1, 12
Abstract: Natural disasters create complex environments where effective human detection is both critical and challenging, especially when individuals are partially occluded. While recent advancements in computer vision have improved detection capabilities, there remains a significant need for efficient solutions that can enhance search-and-rescue (SAR) operations in resource-constrained disaster scenarios. This study modified the original DINO (Detection Transformer with Improved Denoising Anchor Boxes) model and introduced the visibility-enhanced DINO (VE-DINO) model, designed for robust human detection in occlusion-heavy environments, with potential integration into SAR systems. VE-DINO enhances detection accuracy by incorporating body part keypoint information and employing a specialized loss function. The model was trained and validated using the COCO2017 dataset, with additional external testing conducted on the Disaster Occlusion Detection Dataset (DODD), which we developed by meticulously compiling relevant images from existing public datasets to represent occlusion scenarios in disaster contexts. VE-DINO achieved an average precision of 0.615 at IoU 0.50:0.90 on all bounding boxes, outperforming the original DINO model (0.491) on the testing set. In external testing, VE-DINO achieved an average precision of 0.500. An ablation study demonstrated the robustness of the model when confronted with varying degrees of body occlusion. Furthermore, to illustrate its practicality, we conducted a case study demonstrating the usability of the model when integrated into an unmanned aerial vehicle (UAV)-based SAR system, showcasing its potential in real-world scenarios.
Keywords: SAR operations
UAVs
Deep learning
DINO
Occlusion detection
Disaster Occlusion Detection Dataset (DODD)
Human detection
Natural disasters
Resource-constrained environments
Publisher: MDPI AG
Journal: Smart cities 
EISSN: 2624-6511
DOI: 10.3390/smartcities8010012
Rights: © 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
The following publication Zhao, Z.-A., Wang, S., Chen, M.-X., Mao, Y.-J., Chan, A. C.-H., Lai, D. K.-H., Wong, D. W.-C., & Cheung, J. C.-W. (2025). Enhancing Human Detection in Occlusion-Heavy Disaster Scenarios: A Visibility-Enhanced DINO (VE-DINO) Model with Reassembled Occlusion Dataset. Smart Cities, 8(1), 12 is available at https://dx.doi.org/10.3390/smartcities8010012.
Appears in Collections: Journal/Magazine Article

Files in This Item:
File: smartcities-08-00012-v2.pdf
Size: 22.6 MB
Format: Adobe PDF
Open Access Information
Status: Open access
File Version: Version of Record

Web of Science™ Citations: 2 (as of Nov 20, 2025)


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.