Unsupervised Video Object Segmentation for Deep Reinforcement Learning
Speaker: Professor Pascal Poupart, University of Waterloo
Title: "Unsupervised Video Object Segmentation for Deep Reinforcement Learning"
Date: Tuesday, 20 August 2019
Time: 11:00am - 12 noon
Venue: Room 4472 (via lift no. 25/26), HKUST

Abstract:

I will present a new technique for deep reinforcement learning that automatically detects moving objects and uses the relevant information for action selection. The detection of moving objects is done in an unsupervised way by exploiting structure from motion. Instead of directly learning a policy from raw images, the agent first learns to detect and segment moving objects by exploiting flow information in video sequences. The learned representation is then used to focus the policy of the agent on the moving objects. Over time, the agent identifies which objects are critical for decision making and gradually builds a policy based on relevant moving objects. This approach, which we call Motion-Oriented REinforcement Learning (MOREL), is demonstrated on a suite of Atari games where the ability to detect moving objects reduces the amount of interaction needed with the environment to obtain a good policy. Furthermore, the resulting policy is more interpretable than policies that directly map images to actions or values with a black box neural network. We can gain insight into the policy by inspecting the segmentation and motion of each object detected by the agent. This allows practitioners to confirm whether a policy is making decisions based on sensible information.

Our code is available at https://github.com/vik-goel/MOREL

******************

Biography:

Pascal Poupart is a Full Professor in the David R. Cheriton School of Computer Science at the University of Waterloo, Waterloo (Canada). He is the Research Director of the Waterloo Borealis AI Research Lab funded by the Royal Bank of Canada. He is a faculty member of the Waterloo AI Institute and the Vector Institute.
He serves as scientific advisor for Huawei Technologies and ProNavigator. He received the B.Sc. in Mathematics and Computer Science at McGill University, Montreal (Canada) in 1998, the M.Sc. in Computer Science at the University of British Columbia, Vancouver (Canada) in 2000 and the Ph.D. in Computer Science at the University of Toronto, Toronto (Canada) in 2005. His research focuses on the development of algorithms for reasoning under uncertainty and machine learning with application to Assistive Technologies, Natural Language Processing and Telecommunication Networks. He is best known for his contributions to the development of approximate scalable algorithms for partially observable Markov decision processes (POMDPs) and their applications in real-world problems, including automated prompting for people with dementia for the task of handwashing and spoken dialog management. Other notable projects that his research team is currently working on include deep learning with clear semantics, structure learning, personalized transfer learning, conversational agents, adaptive satisfiability, sports analytics and stress detection based on wearable devices. Pascal Poupart received a Cheriton Faculty Fellowship (2015-2018), a best student paper honourable mention (SAT-2017), a silver medal at the SAT-2017 competition, an outstanding collaborator award from Huawei Noah's Ark (2016), a top reviewer award (ICML-2016), a gold medal at the SAT-2016 competition, a best reviewer award (NIPS-2015), an Early Researcher Award from the Ontario Ministry of Research and Innovation (2008), two Google research awards (2007-2008), a best paper award runner up (UAI-2008) and the IAPR best paper award (ICVS-2007).
He serves as associate editor of the Journal of Artificial Intelligence Research (JAIR) (2017 - present), member of the editorial board of the Journal of Machine Learning Research (JMLR) (2009 - present) and guest editor for the Machine Learning Journal (MLJ) (2012 - present). He routinely serves as area chair or senior program committee member for NIPS, ICML, AISTATS, IJCAI, AAAI and UAI. His research collaborators include Borealis AI, Huawei Technologies, Google, Intel, Ford, ProNavigator, SportLogic, Scribendi, Kik Interactive, In the Chat, Slyce, HockeyTech, the Alzheimer Association, the UW-Schlegel Research Institute for Aging, Sunnybrook Health Science Centre and the Toronto Rehabilitation Institute.