Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/104642
DC Field: Value
dc.contributor: Department of Computing
dc.creator: Liang, Zhixuan
dc.identifier.uri: https://theses.lib.polyu.edu.hk/handle/200/12812
dc.language.iso: English
dc.title: Distributed and hierarchical deep reinforcement learning for multi-robot autonomous cooperation
dc.type: Thesis
dcterms.abstract: At present, multi-robot systems (MRS) have attracted extensive attention for their application in settings such as package delivery, space exploration, and autonomous driving. A fundamental problem in MRS is how multiple robots can cooperate to accomplish a common goal or task. The recent development of deep reinforcement learning (DRL) provides a way to enable robots to learn to cooperate in dynamic and complex environments. However, existing DRL approaches tend to rely on centralized training and flat neural architecture design, leading to potential issues such as a single point of failure, performance bottlenecks, and low learning efficiency. Therefore, the goal of this thesis is to develop and implement effective and efficient DRL approaches for real-world multi-robot cooperation.
dcterms.abstract: Several significant challenges must be surmounted to achieve this goal. First, the foundation of DRL is a trial-and-error process: each robot senses the state of the environment, takes an action, and receives a corresponding reward to update its policy network (see the illustrative sketch following the abstract). However, such a trial-and-error process can be prohibitively costly in real-world multi-robot settings. Second, the policy search space of each robot can expand significantly because it must consider the states and actions of other robots; this expansion leads to high learning complexity, making it challenging to find optimal cooperative strategies. Third, designing a suitable reward signal for each robot is non-trivial due to the lack of prior knowledge for quantifying each individual's impact on the team's cooperation. The conventional approach of assigning a shared global reward may lack fairness and impair learning efficiency.
dcterms.abstract: To tackle these challenges, this thesis starts by designing and developing a training and evaluation platform that incorporates diverse cooperative scenarios, a social agent modeling algorithm, an RL-friendly API design, and a generalization evaluation metric. This platform serves as a benchmark environment for the safe training and testing of different DRL approaches. Then, we propose a novel distributed and hierarchical learning approach that combines high-level cooperative decision-making with low-level individual control. The cooperation of multiple robots can be efficiently learned in a high-level discrete action space, while low-level individual control reduces to single-agent reinforcement learning. Our approach lowers learning complexity by hierarchically decomposing the overall task into sub-tasks. In addition, we propose a communication-efficient hierarchical reinforcement learning approach to facilitate multi-robot communication in partially observable environments. Furthermore, we propose a novel reward function design, named SVR (Shapley-value-based Reward), inspired by economics and cooperative game theory. This method offers a model-free way to quantify each individual's contribution, adopting a fairness notion widely accepted in economics (a sketch of the Shapley-value principle follows the abstract). All of the above approaches are trained and evaluated on our developed platform. The experimental results demonstrate the effectiveness and efficiency of our approaches in terms of low collision rate, high lane-change success rate, and fast convergence.
dcterms.abstract: In conclusion, we summarize the lessons learned from training and developing real-world applications, discuss open questions, and explore several future research directions.
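The trial-and-error process described in the second abstract paragraph can be illustrated with a minimal single-robot learning loop. This is only a sketch: `env`, `agent`, and their `reset`/`step`/`act`/`update` methods are hypothetical Gym-style placeholders, not the interfaces used in the thesis.

```python
def train_one_robot(env, agent, num_episodes: int = 100):
    """Minimal trial-and-error loop: sense state, act, receive reward, update policy."""
    for _ in range(num_episodes):
        state = env.reset()                          # sense the initial state
        done = False
        while not done:
            action = agent.act(state)                # choose an action from the current policy
            next_state, reward, done, info = env.step(action)       # receive reward and next state
            agent.update(state, action, reward, next_state, done)   # update the policy network
            state = next_state
```

In a multi-robot setting, each robot runs such a loop, which is why the cost of real-world trial and error and the growth of the joint state-action space become the central difficulties the thesis addresses.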
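The SVR idea of quantifying each robot's individual contribution can be illustrated with a Monte Carlo estimate of Shapley values. This is a sketch of the general Shapley principle, not the SVR algorithm from the thesis: the coalition-value function `team_value` is an assumed placeholder that, in practice, would have to be estimated (for example, from counterfactual evaluations of the team).

```python
import random
from typing import Callable, Dict, FrozenSet, Sequence

def shapley_credit(
    robots: Sequence[str],
    team_value: Callable[[FrozenSet[str]], float],
    num_samples: int = 1000,
) -> Dict[str, float]:
    """Monte Carlo estimate of each robot's Shapley value.

    team_value(coalition) returns the team performance achieved when only
    the robots in `coalition` participate.
    """
    credit = {r: 0.0 for r in robots}
    for _ in range(num_samples):
        order = list(robots)
        random.shuffle(order)                 # sample a random arrival order of robots
        coalition: FrozenSet[str] = frozenset()
        prev = team_value(coalition)
        for r in order:
            coalition = coalition | {r}
            value = team_value(coalition)
            credit[r] += value - prev         # marginal contribution of robot r
            prev = value
    return {r: c / num_samples for r, c in credit.items()}
```

Averaging marginal contributions over random orderings yields a credit split that satisfies the Shapley fairness axioms (efficiency and symmetry), which is the fairness notion from economics that the abstract refers to.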
dcterms.accessRights: open access
dcterms.educationLevel: Ph.D.
dcterms.extent: xviii, 131 pages : color illustrations
dcterms.issued: 2024
dcterms.LCSH: Robots
dcterms.LCSH: Intelligent control systems
dcterms.LCSH: Reinforcement learning
dcterms.LCSH: Machine learning
dcterms.LCSH: Hong Kong Polytechnic University -- Dissertations
Appears in Collections: Thesis