Priority-driven reinforcement learning for multi-aircraft trajectory optimisation under dynamic weather hazards

Zhu, C; Ng, KKH; Chan, PW; Liu, Y; Leung, CYY

doi:10.1016/j.tre.2025.104496

Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/118072

DC Field	Value	Language
dc.contributor	Department of Aeronautical and Aviation Engineering	en_US
dc.creator	Zhu, C	en_US
dc.creator	Ng, KKH	en_US
dc.creator	Chan, PW	en_US
dc.creator	Liu, Y	en_US
dc.creator	Leung, CYY	en_US
dc.date.accessioned	2026-03-12T01:17:27Z	-
dc.date.available	2026-03-12T01:17:27Z	-
dc.identifier.issn	1366-5545	en_US
dc.identifier.uri	http://hdl.handle.net/10397/118072	-
dc.language.iso	en	en_US
dc.publisher	Elsevier Ltd	en_US
dc.subject	Air traffic control	en_US
dc.subject	Dynamic weather hazard	en_US
dc.subject	Quickest priority-based conflict resolution	en_US
dc.subject	Self-attention mechanism	en_US
dc.subject	Trajectory optimisation	en_US
dc.title	Priority-driven reinforcement learning for multi-aircraft trajectory optimisation under dynamic weather hazards	en_US
dc.type	Journal/Magazine Article	en_US
dc.identifier.volume	205	en_US
dc.identifier.doi	10.1016/j.tre.2025.104496	en_US
dcterms.abstract	Reinforcement Learning (RL) has emerged as a state-of-the-art technique for addressing challenges in air traffic control, and weather hazards and flight procedures can contribute to information biases when applying RL to real-world scenarios. This research focuses on the 3D Multi-Aircraft Trajectory Optimisation (3D-MATO) problem under dynamic weather hazards within the Terminal Manoeuvring Area and addresses the aforementioned concerns. We propose an integrated RL-based algorithm incorporating weather avoidance and quick conflict resolution. Given observed weather radar in Flight Information Regions (FIRs), we introduce the Dynamic Fast Marching Method (DFMM) algorithm to reroute flight paths at smaller time intervals, ensuring safer navigation around hazardous regions. To enhance decision-making quality, we develop a Quickest Priority-based Conflict Resolution (QPCR) strategy, which optimises approach sequences and refines available action choices. The RL agent is trained using a Deep Deterministic Policy Gradient (DDPG) framework, and further enhanced with a self-attention mechanism. A numerical study modelled the real-world approach procedures at Hong Kong International Airport involving varying numbers of approach aircraft under dynamic weather hazards. Results demonstrate the high efficiency and effectiveness of the proposed algorithm under traffic mix and weather conditions, highlighting the contributions of its key strategies and individual components.	en_US
dcterms.accessRights	embargoed access	en_US
dcterms.bibliographicCitation	Transportation research. Part E, Logistics and transportation review, Jan. 2026, v. 205, 104496	en_US
dcterms.isPartOf	Transportation research. Part E, Logistics and transportation review	en_US
dcterms.issued	2026-01	-
dc.identifier.scopus	2-s2.0-105022191268	-
dc.identifier.eissn	1878-5794	en_US
dc.identifier.artn	104496	en_US
dc.description.validate	202603 bchy	en_US
dc.description.oa	Not applicable	en_US
dc.identifier.SubFormID	G001174/2026-01	-
dc.description.fundingSource	RGC	en_US
dc.description.fundingSource	Others	en_US
dc.description.fundingText	The work described in this paper was supported by grants from the Research Grants Council, the Hong Kong Government (Grant no. PolyU15201423), Department of Aeronautical and Aviation Engineering, The Hong Kong Polytechnic University, Hong Kong SAR (RMBU, RJ78), the National Natural Science Foundation of China (Grant number: 72301229), and the Research Institute of Sustainable Urban Development (BBG5).	en_US
dc.description.pubStatus	Published	en_US
dc.date.embargo	2029-01-31	en_US
dc.description.oaCategory	Green (AAM)	en_US
Appears in Collections:	Journal/Magazine Article

Open Access Information

Status	embargoed access
Embargo End Date	2029-01-31

