Deep Reinforcement Learning for Continuous Control Environments with Multi-modal State-space and Sparse Rewards
MPhil Thesis Defence

Title: "Deep Reinforcement Learning for Continuous Control Environments with Multi-modal State-space and Sparse Rewards"

By

Mr. Cancheng ZENG

Abstract

One of the most important problems in artificial intelligence is learning to solve control problems without human supervision. Recent advances in deep reinforcement learning have led to significant progress in this domain. Researchers have solved a number of problems, including a subset of Atari games, the game of Go, and several simple robot control environments. However, a general solution to more realistic problems is still missing. Most real-world robot control problems have a multi-modal state space, which usually consists of both a low-dimensional motion sensor input and a high-dimensional image sensor input. In addition, a smooth and informative reward signal is usually unavailable, and the agent is provided only with a reward signal that is sparse and discrete.

We study a set of continuous control problems with multi-modal state spaces. Some of these problems also have sparse reward functions. We propose several techniques to improve the performance of flat reinforcement learning methods on the multi-modal state-space problems. The proposed techniques include the Wasserstein actor-critic trust-region policy optimization method (W-KTR), the exceptional advantage regularization method, and the robust concentric Gaussian mixture policy model. Experimental results show that the proposed techniques, especially the exceptional advantage regularization method, lead to considerable performance improvement.

For the sparse-reward multi-modal state-space problems, we propose a hierarchical reinforcement learning method, the flexible-scheduling hierarchical method. Experimental results show that, given a set of pre-defined source tasks, this method can solve the problems without domain-specific knowledge.
Date: Tuesday, 21 August 2018

Time: 2:00pm - 4:00pm

Venue: Room 3494, Lifts 25/26

Committee Members:
Prof. Dit-Yan Yeung (Supervisor)
Prof. Nevin Zhang (Chairperson)
Prof. Fangzhen Lin

**** ALL are Welcome ****