Towards efficient reinforcement learning
Speaker: Dongruo Zhou (UCLA)
Date: Thursday, 8 December 2022
Time: 10:00am - 11:00am
Zoom Link:
https://hkust.zoom.us/j/465698645?pwd=aVRaNWs2RHNFcXpnWGlkR05wTTk3UT09
Meeting ID: 465-698-645
Passcode: 20222023
Abstract:
Reinforcement learning (RL) has achieved great empirical success on many
real-world problems over the past few years. However, many RL algorithms
remain inefficient: they are data-hungry and computationally expensive. In
this talk, I will give a selective overview of my recent research, which
aims to make RL efficient along its two elemental aspects: exploration and
exploitation.
In the first part of the talk, I will focus on efficient exploration
methods for RL. I will introduce a series of neural network-based
exploration strategies for the contextual bandit problem, a basic setting
underlying many popular RL algorithms. My work suggests that these
exploration strategies make RL with neural networks both theoretically
sound and empirically promising.
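The talk's methods are neural; as a rough illustration of the underlying exploration principle (optimism via a confidence-based bonus added to the reward estimate), here is a minimal linear-UCB sketch for a synthetic contextual bandit. All names, parameter values, and the reward model are illustrative assumptions, not details from the talk.

```python
import numpy as np

def linucb(n_rounds=2000, d=5, n_arms=4, alpha=1.0, noise=0.1, seed=0):
    """Run linear UCB on a synthetic contextual bandit and return the
    average reward collected. Arm a has an unknown parameter theta_true[a];
    pulling a in context x yields theta_true[a] @ x plus Gaussian noise."""
    rng = np.random.default_rng(seed)
    theta_true = rng.normal(size=(n_arms, d))

    # Per-arm ridge-regression statistics: A = I + sum(x x^T), b = sum(r x)
    A = np.stack([np.eye(d) for _ in range(n_arms)])
    b = np.zeros((n_arms, d))

    total = 0.0
    for _ in range(n_rounds):
        x = rng.normal(size=d)
        x /= np.linalg.norm(x)                      # unit-norm context
        scores = []
        for a in range(n_arms):
            A_inv = np.linalg.inv(A[a])
            theta_hat = A_inv @ b[a]                # least-squares estimate
            bonus = alpha * np.sqrt(x @ A_inv @ x)  # exploration bonus
            scores.append(theta_hat @ x + bonus)
        a = int(np.argmax(scores))                  # optimistic arm choice
        r = theta_true[a] @ x + noise * rng.normal()
        A[a] += np.outer(x, x)                      # update chosen arm only
        b[a] += r * x
        total += r
    return total / n_rounds
```

The neural variants discussed in the talk replace the fixed linear features with a learned network, but the optimism-under-uncertainty structure is the same.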
In the second part of the talk, I will turn to efficient exploitation. I
will introduce a weighted linear regression scheme whose weights are
variance-dependent. My work shows that this scheme accelerates the model
learning process of RL over existing samples and can boost the performance
of many existing RL algorithms. For example, when the state and action
spaces are large, the weighted linear regression scheme yields the first
computationally efficient RL algorithm whose sample complexity for finding
the optimal policy is both near-optimal and nearly independent of the
planning horizon, resolving a fundamental gap in RL theory.
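The core idea of variance-dependent weighting can be sketched generically: each sample is weighted by the inverse of its (estimated) noise variance before solving a ridge-regression problem, so low-noise samples dominate the fit. This is a standard weighted-least-squares sketch under my own naming, not the talk's exact algorithm.

```python
import numpy as np

def weighted_ridge(X, y, variances, lam=1.0):
    """Variance-weighted ridge regression: solve
    argmin_theta sum_i (1/var_i) * (y_i - x_i @ theta)^2 + lam * ||theta||^2.
    Samples with lower noise variance receive higher weight."""
    w = 1.0 / np.maximum(variances, 1e-8)        # weight = inverse variance
    A = X.T @ (w[:, None] * X) + lam * np.eye(X.shape[1])
    b = X.T @ (w * y)
    return np.linalg.solve(A, b)
```

In the RL context described above, the regression targets would be value estimates whose noise variance differs across transitions; down-weighting high-variance samples is what enables the sharper sample-complexity guarantees.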
******************
Biography:
Dongruo Zhou is a final-year PhD student in the Department of Computer
Science at UCLA, advised by Prof. Quanquan Gu. His research broadly
concerns the foundations of machine learning, with a particular focus on
reinforcement learning and stochastic optimization. He aims to provide a
theoretical understanding of machine learning methods, as well as to
develop new machine learning algorithms with better performance. He is a
recipient of the UCLA Dissertation Year Fellowship.