Multi-modal data sensing and fusion for complex indoor scenes

PhD Thesis Proposal Defence


Title: "Multi-modal data sensing and fusion for complex indoor scenes"

By

Miss Rongrong GAO


Abstract:

In the field of intelligent robotics, the necessity for advanced vision 
capabilities to facilitate perception and interaction in the real world is 
paramount. While significant strides have been made in computer vision in 
recent years, the predominant paradigm revolves around the analysis of RGB 
images, yielding 2D outputs in the digital realm, such as bounding boxes and 
masks. This traditional approach, while valuable, exhibits limitations when 
applied to the complex and multidimensional challenges presented in real-world 
robotics scenarios. RGB information alone cannot sufficiently capture the 
three-dimensional spatial and contextual nuances essential for effective 
interaction in diverse environments. Consequently, this discrepancy calls for a 
paradigm shift in vision capabilities, emphasizing the need for multi-modal 
representations to enrich the robots' understanding of their surroundings.

Incorporating additional sensing modalities, such as depth imaging and 
time-of-flight imaging, becomes instrumental in providing a more holistic and 
nuanced perception of the physical environment. To this end, this dissertation 
delves into a comprehensive exploration of multi-modal perception within 
robotic scenarios. The primary focus encompasses three pivotal topics:

o Geometric learning of time-of-flight data, including depth and normal 
  estimation;
o Colorization of three-dimensional point cloud data for better scene 
  understanding;
o Multi-modal data compression and decompression for better online storage and 
  sharing.

This dissertation presents a systematic technical survey of each of these three 
subtopics, covering the current state of technological development and 
existing achievements in each sub-field, with an in-depth analysis.

Building on this extensive survey and review, my doctoral dissertation will 
pursue a series of research studies on these three topics. The research 
endeavors to contribute state-of-the-art models, innovative methods, and 
curated data sets to the related sub-fields, thereby taking a step towards 
enhancing intelligent perception for robots. In addition, this review is 
expected to give researchers a clear understanding of the field to guide 
future research and development.


Date:                   Thursday, 25 April 2024

Time:                   3:00pm - 5:00pm

Venue:                  Room 5510
                        Lifts 25/26

Committee Members:      Dr. Qifeng Chen (Supervisor)
                        Dr. Yangqiu Song (Chairperson)
                        Dr. Long Chen
                        Dr. Dan Xu