Towards efficient reinforcement learning

Speaker: Dongruo Zhou
         UCLA

Title:  "Towards efficient reinforcement learning"

Date:   Thursday, 8 December 2022

Time:   10:00am - 11:00am

Zoom Link:
https://hkust.zoom.us/j/465698645?pwd=aVRaNWs2RHNFcXpnWGlkR05wTTk3UT09

Meeting ID: 465-698-645
Passcode: 20222023

Abstract:

Reinforcement learning (RL) has achieved great empirical success on many
real-world problems in the past few years. However, many RL algorithms
remain inefficient due to their data-hungry and computationally expensive
nature. In this talk, I will give a selective overview of my recent
research, which aims to make RL efficient along its two elemental aspects:
exploration and exploitation.

In the first part of the talk, I will focus on efficient exploration
methods for RL. I will introduce a series of neural network-based
exploration strategies for the contextual bandit problem, which is the
basic setting underlying many popular RL algorithms. My work suggests
that the proposed exploration strategies make RL with neural networks
both theoretically sound and empirically promising.
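To make the exploration idea concrete, here is a minimal sketch of a classic UCB-style strategy for the *linear* contextual bandit (the basic setting mentioned above): the agent picks the arm maximizing its ridge-regression reward estimate plus a confidence-width bonus. This is a standard LinUCB-style baseline for illustration, not the speaker's neural method; all names and parameters here are illustrative.

```python
import numpy as np

def lin_ucb(contexts, reward_fn, n_rounds, alpha=1.0, seed=0):
    """Minimal LinUCB sketch: at each round, choose the arm maximizing
    (predicted reward) + alpha * (confidence width)."""
    rng = np.random.default_rng(seed)
    d = contexts.shape[1]
    A = np.eye(d)           # regularized Gram matrix of observed features
    b = np.zeros(d)         # running sum of reward-weighted features
    chosen = []
    for _ in range(n_rounds):
        A_inv = np.linalg.inv(A)
        theta = A_inv @ b   # ridge-regression estimate of reward weights
        # Exploration bonus: sqrt(x^T A^{-1} x) per arm, shrinking with data.
        widths = np.sqrt(np.einsum("ad,dk,ak->a", contexts, A_inv, contexts))
        a = int(np.argmax(contexts @ theta + alpha * widths))
        r = reward_fn(a, rng)
        A += np.outer(contexts[a], contexts[a])
        b += r * contexts[a]
        chosen.append(a)
    return chosen
```

With one-hot contexts this reduces to the multi-armed UCB algorithm; the neural strategies in the talk replace the linear estimate with a neural network while keeping an analogous confidence bonus.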

In the second part of the talk, I will turn to efficient exploitation in
RL. I will introduce a weighted linear regression scheme whose weights
are variance-dependent. My work shows that the proposed scheme
accelerates the model learning process of RL over existing samples and
can boost the performance of many existing RL algorithms. For example,
when the state and action spaces are large, the weighted linear
regression scheme yields the first computationally efficient RL algorithm
whose sample complexity for finding the optimal policy is both
near-optimal and nearly independent of the planning horizon, resolving a
fundamental gap in RL theory.
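The core computational idea behind variance-dependent weighting can be sketched as ordinary weighted least squares: each sample is weighted by the inverse of its (estimated) noise variance, so low-noise samples dominate the fit. This is a generic illustration of the weighting principle under that assumption, not the algorithm from the talk.

```python
import numpy as np

def variance_weighted_ls(X, y, var):
    """Weighted least squares with per-sample weights w_i = 1 / var_i.
    Solves the normal equations (X^T W X) theta = X^T W y."""
    w = 1.0 / np.asarray(var)
    XtW = X.T * w                      # broadcast weights across columns
    return np.linalg.solve(XtW @ X, XtW @ y)
```

Intuitively, samples whose targets are nearly noiseless carry almost all the statistical weight, which is how variance information can speed up model learning from the same data.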


******************
Biography:

Dongruo Zhou is a final-year PhD student in the Department of Computer
Science at UCLA, advised by Prof. Quanquan Gu. His research is broadly on
the foundations of machine learning, with a particular focus on
reinforcement learning and stochastic optimization. He aims to provide a
theoretical understanding of machine learning methods, as well as to
develop new machine learning algorithms with better performance. He is a
recipient of the UCLA Dissertation Year Fellowship.