Towards efficient reinforcement learning
Speaker: Dongruo Zhou (UCLA)
Title: "Towards efficient reinforcement learning"
Date: Thursday, 8 December 2022
Time: 10:00am - 11:00am
Zoom Link: https://hkust.zoom.us/j/465698645?pwd=aVRaNWs2RHNFcXpnWGlkR05wTTk3UT09
Meeting ID: 465-698-645
Passcode: 20222023

Abstract:
Reinforcement learning (RL) has achieved great empirical success on many real-world problems in the last few years. However, many RL algorithms are considered inefficient because they are data-hungry and computationally expensive. In this talk, I will give a selective overview of my recent research, which aims to make RL efficient along its two elemental aspects: exploration and exploitation.

In the first part of the talk, I will focus on efficient exploration methods for RL. I will introduce a series of neural network-based exploration strategies for the contextual bandit problem, a basic setting underlying many popular RL algorithms. My work suggests that the proposed exploration strategies make RL with neural networks both theoretically sound and empirically promising.

In the second part of the talk, I will discuss RL with efficient exploitation. I will introduce a weighted linear regression scheme whose weights are variance-dependent. My work shows that the proposed scheme accelerates the model learning process of RL over existing samples and can boost the performance of many existing RL algorithms. For example, when the state and action spaces are large, the weighted linear regression scheme yields the first computationally efficient RL algorithm whose sample complexity for finding the optimal policy is near-optimal and nearly independent of the planning horizon, resolving a fundamental gap in RL theory.

******************

Biography:
Dongruo Zhou is a final-year PhD student in the Department of Computer Science at UCLA, advised by Prof. Quanquan Gu.
His research is broadly on the foundations of machine learning, with a particular focus on reinforcement learning and stochastic optimization. He aims to provide a theoretical understanding of machine learning methods, as well as to develop new machine learning algorithms with better performance. He is a recipient of the UCLA Dissertation Year Fellowship.
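For readers unfamiliar with the contextual bandit setting mentioned in the first part of the abstract, the classic linear UCB principle — act optimistically by adding an uncertainty bonus to the reward estimate — is the idea that neural exploration strategies generalize. The following is a minimal sketch of that baseline principle, not of the speaker's neural algorithms; the class name and the parameters `beta` and `lam` are illustrative.

```python
import numpy as np

class LinUCB:
    """Linear UCB for contextual bandits: pick the arm maximizing the
    optimistic estimate theta^T x + beta * sqrt(x^T A^{-1} x), where the
    square-root term is an exploration bonus that shrinks as an arm's
    direction is observed more often."""

    def __init__(self, dim, beta=1.0, lam=1.0):
        self.A = lam * np.eye(dim)   # regularized Gram matrix of observed contexts
        self.b = np.zeros(dim)       # running sum of reward-weighted contexts
        self.beta = beta             # width of the confidence bonus

    def select(self, contexts):
        """Return the index of the arm with the highest upper confidence bound."""
        theta = np.linalg.solve(self.A, self.b)     # ridge estimate of rewards
        A_inv = np.linalg.inv(self.A)
        bonus = np.sqrt(np.einsum('ij,jk,ik->i', contexts, A_inv, contexts))
        return int(np.argmax(contexts @ theta + self.beta * bonus))

    def update(self, x, reward):
        """Incorporate the observed (context, reward) pair."""
        self.A += np.outer(x, x)
        self.b += reward * x
```

With `beta = 0` the rule is purely greedy; a positive `beta` trades off estimated reward against uncertainty, which is the exploration mechanism the neural strategies in the talk extend beyond the linear-reward assumption.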
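The second part of the abstract describes a weighted linear regression scheme with variance-dependent weights. As a rough illustration of that general idea — down-weighting high-variance observations so that reliable samples dominate the fit — here is a generic weighted ridge regression sketch; the function name, the regularizer `lam`, and the `1/variance` weighting rule are assumptions for illustration, not the talk's exact algorithm.

```python
import numpy as np

def weighted_ridge(X, y, variances, lam=1.0):
    """Variance-weighted ridge regression: sample i gets weight
    1 / max(variances[i], eps), so low-variance (more reliable)
    observations contribute more to the estimate."""
    eps = 1e-6
    w = 1.0 / np.maximum(variances, eps)       # variance-dependent weights
    Xw = X * w[:, None]                        # scale each row by its weight
    A = X.T @ Xw + lam * np.eye(X.shape[1])    # weighted Gram matrix + ridge term
    b = Xw.T @ y                               # weighted targets
    return np.linalg.solve(A, b)               # solve (X^T W X + lam I) theta = X^T W y
```

Intuitively, in model-based RL the variance of a transition's value target can differ greatly across samples, and reweighting by (an estimate of) that variance is what lets the scheme extract more information from the same data.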