Please use this identifier to cite or link to this item: http://hdl.handle.net/10397/89757
Title: Data-driven deep reinforcement learning for decision-making applications
Authors: Wang, Jia
Degree: Ph.D.
Issue Date: 2021
Abstract: Decision-making applications have become an important part of today's competitive, knowledge-based society and benefit many important areas. Machine learning has made significant progress recently, thanks to the availability of data at unprecedented scale. By learning from past data, machine learning systems can make better decisions than those relying solely on domain knowledge. Among machine learning algorithms, reinforcement learning (RL) is especially promising because it learns to map current conditions to decision solutions and accounts for the impact of current decisions on subsequent ones. Typically, an RL agent learns through trial and error, using data harvested from its own experience to make informed decisions. In many practical applications, large amounts of offline-collected data with rich prior information already exist, so the ability to learn from big data becomes the key to applying reinforcement learning to realistic decision-making problems. Unlike traditional RL methods, which interact with an online environment, learning a strategy from a fixed dataset is particularly challenging, for three reasons. First, data generated from daily system operations are not independent and identically distributed; by training on a partial dataset, an RL agent may converge to a model that leaves it reluctant to explore the remaining data and further improve its performance. Second, without a proper understanding of the underlying data distribution, an RL agent may learn a decision-making strategy that over-fits the observed samples in the training set but fails to generalize to unseen samples in the testing set. Third, the RL training process can be very unstable when data are noisy and highly variable.
In this thesis, we study data-driven reinforcement learning, aiming to derive decision strategies from big data collected offline. The first contribution of this thesis is enabling an RL agent to learn strategies from data with repetitive patterns. To force an RL agent to fully "explore" massive data, we partition the historical big dataset into multi-batch datasets and study, in both theory and practice, how an RL agent can incrementally improve its strategy by learning from them. The second contribution is exploring the underlying data distribution under the reinforcement learning scheme. With the generative distribution, one can select the hardest (most representative) samples to train the strategy model, thereby achieving better application performance. The third contribution is applying the RL method to learn strategies from high-variance data. Specifically, we bound the parameter distribution of the new strategy to be relatively close to that of its predecessor strategy, which stabilizes training. Finally, through data-driven reinforcement learning, we thoroughly study various applications, including social analysis, dynamic resource allocation, and multi-agent pattern formation.
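The third contribution's stabilization idea — keeping the new strategy close to its predecessor — is commonly realized in practice with a proximal (trust-region-style) update, such as the clipped surrogate objective of PPO. The sketch below is a minimal illustration of that general technique, assuming a clipped-ratio formulation; it is not the thesis's exact method, and the function name, epsilon value, and log-probability inputs are illustrative assumptions:

```python
import numpy as np

def clipped_surrogate_loss(new_logp, old_logp, advantages, eps=0.2):
    """PPO-style clipped objective: bounds how far the new policy may
    move away from its predecessor in one update, which stabilizes
    training on noisy, high-variance data."""
    ratio = np.exp(new_logp - old_logp)              # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantages
    # Take the pessimistic (elementwise minimum) surrogate, negate for a loss.
    return -np.mean(np.minimum(unclipped, clipped))
```

With identical old and new policies the ratio is 1 and the loss reduces to the negative mean advantage; when the new policy drifts far from the old one, clipping caps the incentive so a single batch cannot pull the parameters arbitrarily far.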
Subjects: Reinforcement learning
Machine learning
Big data
Hong Kong Polytechnic University -- Dissertations
Pages: xvi, 100 pages : color illustrations
Appears in Collections: Thesis


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.