Structured Multi-Modal and Multi-Task Deep Learning for 2D/3D Scene Understanding

Date: 23 September 2020 (Wednesday)
Time: 6:00-7:30pm
Venue: Online via Zoom Meeting
Meeting ID: 993 9758 3617
Passcode: 1q866m
Title: Structured Multi-Modal and Multi-Task Deep Learning for 2D/3D Scene Understanding
Speaker: Dr. Dan Xu, CSE

Introduction:

In this talk, Dan will introduce his research on structured, multi-modal, and multi-task deep learning. The talk will cover several important computer vision tasks within the context of visual scene understanding, including scene depth estimation, joint learning of scene depth and scene parsing, and visual SLAM. Specifically, he will present statistical-modelling-based deep model design for structured neural network prediction and representation learning, a novel prediction-and-distillation (PAD) network structure for deep multi-task learning, and some recent research directions on learning the interaction between 2D and 3D data and tasks in scene understanding for self-driving and robotics applications.

Speaker:

Dr. Dan Xu is an Assistant Professor in the Department of Computer Science and Engineering (CSE) at the Hong Kong University of Science and Technology (HKUST). Before joining HKUST, he was a postdoctoral researcher in the Visual Geometry Group (VGG) in the Department of Engineering Science at the University of Oxford. He received his Ph.D. in Computer Science from the University of Trento in 2018. He was also a visiting Ph.D. student in the MM Lab at the Chinese University of Hong Kong (CUHK). His research mainly focuses on computer vision, multimedia, and deep learning. He received the Best Scientific Paper Award at ICPR 2016 and a Best Paper nomination at ACM Multimedia 2018.
