Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/104642
Title: Distributed and hierarchical deep reinforcement learning for multi-robot autonomous cooperation
Authors: Liang, Zhixuan
Degree: Ph.D.
Issue Date: 2024
Abstract: At present, multi-robot systems (MRS) have attracted extensive attention for their applications in settings such as package delivery, space exploration, and autonomous driving. A fundamental problem in MRS is how multiple robots can cooperate to achieve a common goal or task. The recent development of deep reinforcement learning (DRL) provides a way to enable robots to learn to cooperate in dynamic and complex environments. However, existing DRL approaches tend to rely on centralized training and flat neural architecture design, leading to potential issues such as a single point of failure, performance bottlenecks, and low learning efficiency. Therefore, the goal of this thesis is to develop and implement effective and efficient DRL approaches for real-world multi-robot cooperation.
Several significant challenges must be surmounted to achieve this goal. First, the primary foundation of DRL is the trial-and-error process: each robot senses the state of the environment, takes an action, and receives a corresponding reward to update its policy network. However, such a trial-and-error process can be prohibitively costly in real-world multi-robot settings. Second, the policy search space of each robot can expand significantly because it must consider the states and actions of other robots. This expansion leads to high learning complexity, making it challenging to find optimal cooperative strategies. Third, designing a suitable reward signal for each robot is non-trivial due to the lack of prior knowledge to quantify each individual's impact on the team's cooperation. The conventional approach of assigning a shared global reward may lack fairness and impair learning efficiency.
To tackle these challenges, this thesis starts by designing and developing a training and evaluation platform that incorporates diverse cooperative scenarios, a social agent modeling algorithm, an RL-friendly API design, and a generalization evaluation metric. This platform serves as a benchmark environment for the safe training and testing of different DRL approaches. Then, we propose a novel distributed and hierarchical learning approach that combines high-level cooperative decision-making with low-level individual control. The cooperation of multiple robots can be efficiently learned in a high-level discrete action space, while low-level individual control reduces to single-agent reinforcement learning. Our approach reduces learning complexity by hierarchically decomposing the overall task into sub-tasks. In addition, we propose a communication-efficient hierarchical reinforcement learning approach to facilitate multi-robot communication in a partially observable environment. Furthermore, we propose a novel reward function design, named SVR (Shapley-value-based Reward), inspired by economics and cooperative game theory. This method offers a model-free approach to quantifying each individual's contribution, adopting a fairness computation widely accepted in economics. All of the above approaches are trained and evaluated on our developed platform. The experimental results demonstrate the effectiveness and efficiency of our approaches in terms of low collision rates, high lane-change success rates, and fast convergence.
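To make the Shapley-value idea behind SVR concrete, the following is a minimal, hypothetical sketch (not the thesis's implementation) of how per-robot contributions could be estimated by Monte Carlo sampling over coalition orderings. The `team_value` function, robot names, and sample count are assumptions for illustration; in a model-free setting such a coalition value would be estimated from rollouts rather than given in closed form.

```python
import random
from typing import Callable, Dict, FrozenSet, List

def shapley_rewards(robots: List[str],
                    team_value: Callable[[FrozenSet[str]], float],
                    num_samples: int = 1000) -> Dict[str, float]:
    """Monte Carlo estimate of each robot's Shapley value.

    `team_value(coalition)` is a hypothetical evaluator returning the team
    reward achieved by a sub-coalition of robots (an assumption for this
    sketch, not an API from the thesis).
    """
    contributions = {r: 0.0 for r in robots}
    for _ in range(num_samples):
        order = random.sample(robots, len(robots))  # random permutation of robots
        coalition: FrozenSet[str] = frozenset()
        prev_value = team_value(coalition)
        for r in order:
            coalition = coalition | {r}
            value = team_value(coalition)
            contributions[r] += value - prev_value  # marginal contribution of r
            prev_value = value
    return {r: c / num_samples for r, c in contributions.items()}

# Toy usage: a super-additive team value, so cooperation pays off.
robots = ["r1", "r2", "r3"]
print(shapley_rewards(robots, lambda s: len(s) ** 2))
```

Averaging marginal contributions over random orderings converges to the Shapley value, which assigns each robot a reward proportional to its individual impact on the team outcome rather than an undifferentiated shared global reward.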
In conclusion, we summarize the lessons learned from training and developing real-world applications, discuss open questions, and explore several future research directions.
Subjects: Robots
Intelligent control systems
Reinforcement learning
Machine learning
Hong Kong Polytechnic University -- Dissertations
Pages: xviii, 131 pages : color illustrations
Appears in Collections: Thesis
