Deep reinforcement learning approach for dynamic distribution network reconfiguration based on sequential masking

Wang, R; Bi, X; Bu, S; Tang, Z

doi:10.1109/TNNLS.2025.3574208

Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/116189

DC Field	Value	Language
dc.contributor	Department of Electrical and Electronic Engineering	en_US
dc.contributor	International Centre of Urban Energy Nexus	en_US
dc.contributor	Policy Research Centre for Innovation and Technology	en_US
dc.contributor	Research Institute for Smart Energy	en_US
dc.contributor	Mainland Development Office	en_US
dc.creator	Wang, R	en_US
dc.creator	Bi, X	en_US
dc.creator	Bu, S	en_US
dc.creator	Tang, Z	en_US
dc.date.accessioned	2025-11-26T04:20:57Z	-
dc.date.available	2025-11-26T04:20:57Z	-
dc.identifier.issn	2162-237X	en_US
dc.identifier.uri	http://hdl.handle.net/10397/116189	-
dc.language.iso	en	en_US
dc.publisher	Institute of Electrical and Electronics Engineers	en_US
dc.rights	© 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.	en_US
dc.rights	The following publication R. Wang, X. Bi, S. Bu and Z. Tang, 'Deep Reinforcement Learning Approach for Dynamic Distribution Network Reconfiguration Based on Sequential Masking,' in IEEE Transactions on Neural Networks and Learning Systems, vol. 36, no. 10, pp. 19270-19284, Oct. 2025 is available at https://doi.org/10.1109/TNNLS.2025.3574208.	en_US
dc.subject	Deep reinforcement learning (DRL)	en_US
dc.subject	Dynamic distribution network reconfiguration (DDNR)	en_US
dc.subject	Sequential masking	en_US
dc.subject	Soft actor critic (SAC)	en_US
dc.title	Deep reinforcement learning approach for dynamic distribution network reconfiguration based on sequential masking	en_US
dc.type	Journal/Magazine Article	en_US
dc.identifier.spage	19270	en_US
dc.identifier.epage	19284	en_US
dc.identifier.volume	36	en_US
dc.identifier.issue	10	en_US
dc.identifier.doi	10.1109/TNNLS.2025.3574208	en_US
dcterms.abstract	Dynamic distribution network reconfiguration (DDNR) is a widely used technique for the secure and economic operation of power distribution networks (PDNs), especially in the presence of high-penetration renewable energy sources (RESs). DDNR is realized by controlling the on/off status of remotely controlled switches (RCSs) equipped at power lines in PDNs to optimize power flows. Thanks to the enhanced data availability of PDNs, data-driven solutions to DDNR, such as deep reinforcement learning (DRL), have gained growing attention recently. However, DDNR solves a sequence of combinatorial problems featuring a vast and sparse action space incurred by a so-called “radiality constraint,” which is highly challenging for DRLs to handle. Existing DRL methods are either unscalable to large-scale problems or potentially restrict optimality. Hence, we propose a sequential masking strategy to decompose its complex action space into a sequence of maskable sub-action spaces. A gated recurrent unit (GRU)-based agent and an adapted soft actor critic (SAC) algorithm are designed accordingly, producing a data-efficient, safety-guaranteed, and scalable DRL solution to the DDNR problem. Comprehensive comparisons with existing data-driven methods and model-based benchmarks are conducted via various case studies, demonstrating the advantages of the proposed method in both algorithmic performance and scalability.	en_US
dcterms.accessRights	open access	en_US
dcterms.bibliographicCitation	IEEE transactions on neural networks and learning systems, Oct. 2025, v. 36, no. 10, p. 19270-19284	en_US
dcterms.isPartOf	IEEE transactions on neural networks and learning systems	en_US
dcterms.issued	2025-10	-
dc.identifier.scopus	2-s2.0-105007713620	-
dc.identifier.eissn	2162-2388	en_US
dc.description.validate	202511 bcjz	en_US
dc.description.oa	Accepted Manuscript	en_US
dc.identifier.SubFormID	G000388/2025-07	-
dc.description.fundingSource	Others	en_US
dc.description.fundingText	PolyU for RISE Seed Project (Grant Number: U-CDC8); PolyU for PReCIT Seed Project (Grant Number: 1-CE16); PolyU for Intra-Faculty Interdisciplinary Project (Grant Number: 1-WZ4L); Beijing Normal-Hong Kong Baptist University for Start-up Fund (Grant Number: UICR0700116-25)	en_US
dc.description.pubStatus	Published	en_US
dc.description.oaCategory	Green (AAM)	en_US
Appears in Collections:	Journal/Magazine Article
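
The abstract above describes the radiality constraint and the idea of splitting the switching decision into a sequence of maskable sub-actions. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes a small hypothetical 4-bus network, treats "radial" as "connected with no loops" (a spanning tree), and uses a random choice as a stand-in for the GRU-based policy. All function and variable names are illustrative.

```python
import random


def is_connected(n_nodes, closed_lines):
    """Return True if every node is reachable from node 0 via closed lines."""
    adjacency = {node: [] for node in range(n_nodes)}
    for u, v in closed_lines:
        adjacency[u].append(v)
        adjacency[v].append(u)
    seen, stack = {0}, [0]
    while stack:
        for neighbour in adjacency[stack.pop()]:
            if neighbour not in seen:
                seen.add(neighbour)
                stack.append(neighbour)
    return len(seen) == n_nodes


def feasible_mask(n_nodes, lines, closed):
    """mask[i] is True if line i is closed and opening it keeps all buses connected."""
    mask = []
    for i in range(len(lines)):
        if not closed[i]:
            mask.append(False)
            continue
        remaining = [lines[j] for j in range(len(lines)) if closed[j] and j != i]
        mask.append(is_connected(n_nodes, remaining))
    return mask


def sample_radial_topology(n_nodes, lines, rng):
    """Open one switch per step; each step only sees the currently feasible switches."""
    closed = [True] * len(lines)            # assumption: start from the fully meshed network
    n_to_open = len(lines) - (n_nodes - 1)  # a tree on n_nodes buses has n_nodes - 1 lines
    for _ in range(n_to_open):
        mask = feasible_mask(n_nodes, lines, closed)
        candidates = [i for i, ok in enumerate(mask) if ok]
        choice = rng.choice(candidates)     # stand-in for a learned, masked policy
        closed[choice] = False
    return closed


# Hypothetical 4-bus network: a ring 0-1-2-3-0 plus one tie line 0-2.
LINES = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
status = sample_radial_topology(4, LINES, random.Random(0))
print(status)  # one True/False (closed/open) entry per line
```

Because each step masks out any switch whose opening would island a bus, every sampled configuration in this sketch is radial by construction, which illustrates why masking can give a feasibility guarantee without enumerating all spanning trees up front.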

Files in This Item:

File	Description	Size	Format
Wang_Deep_Reinforcement_Learning.pdf	Pre-Published version	4.73 MB	Adobe PDF	View/Open

Open Access Information

Status	open access
File Version	Final Accepted Manuscript

