On Building Generalizable Learning Agents

Speaker:        Dr. Yi WU
                Researcher
                OpenAI Inc., USA

Title:          "On Building Generalizable Learning Agents"

Date:           Monday, 4 November 2019

Time:           2:30pm - 3:30pm

Venue:          Room 5583 (via lift no. 27/28), HKUST

Abstract:

Despite the recent practical successes of deep reinforcement learning
(DRL), one critical issue for existing DRL work is generalization. A
learned neural policy can be extremely specialized to its training
scenarios and can easily fail when the agent is tested in a scenario only
slightly different from the training ones. In contrast, humans can
readily adapt their learned skills to unseen situations without further
training. This generalization challenge reveals a fundamental gap on the
path toward our ultimate goal of building agents with artificial general
intelligence (AGI).

This talk presents progress on this challenge. We find that one solution
is to equip learning agents with long-term planning abilities. We first
describe why an agent with a simple feed-forward policy fails to
generalize well even on simple tasks, and then propose plannable policy
representations for both fully observable and partially observable
settings. We empirically show that simply augmenting conventional neural
agents with our proposed planning modules significantly improves
generalization performance on a variety of tasks, including a challenging
visual navigation application.
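To give a flavor of the long-term planning the abstract refers to, the sketch below runs classical value iteration on a hypothetical 4x4 gridworld. This is a toy illustration only, not the speaker's proposed method: the grid size, reward placement, and dynamics are all assumptions chosen for the example, and the point is simply that iterated Bellman backups propagate value from a distant goal, which a one-shot feed-forward policy cannot do.

```python
import numpy as np

# Hypothetical 4x4 gridworld (an assumption for illustration): the agent
# moves up/down/left/right, moves off the grid leave it in place, and the
# goal cell (3, 3) yields reward 1.
H, W, GAMMA, N_ITERS = 4, 4, 0.9, 50
reward = np.zeros((H, W))
reward[3, 3] = 1.0
value = np.zeros((H, W))

def next_states(r, c):
    # Candidate successor states for the four moves, clipped to the grid.
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        yield min(max(r + dr, 0), H - 1), min(max(c + dc, 0), W - 1)

for _ in range(N_ITERS):
    new_value = np.empty_like(value)
    for r in range(H):
        for c in range(W):
            # Bellman backup: immediate reward plus the best discounted
            # value among reachable successor states.
            new_value[r, c] = reward[r, c] + GAMMA * max(
                value[nr, nc] for nr, nc in next_states(r, c))
    value = new_value
```

After the backups converge, the greedy policy with respect to `value` walks toward the goal from any cell, so the planning happens in the value computation rather than in a reactive policy network.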


******************
Biography:

Yi Wu is currently a researcher at OpenAI Inc. and will join the
Institute for Interdisciplinary Information Sciences (IIIS), Tsinghua
University, as an assistant professor in Fall 2020. He recently earned
his PhD from UC Berkeley under the supervision of Prof. Stuart Russell.
His research focuses on improving the generalization ability of AI
systems. He is broadly interested in a variety of topics in AI, including
deep reinforcement learning, natural language processing, and
probabilistic programming. His work on Value Iteration Networks won the
Best Paper Award at NIPS 2016.