Multi-modal data sensing and fusion for complex indoor scenes
PhD Thesis Proposal Defence

Title: "Multi-modal data sensing and fusion for complex indoor scenes"

By Miss Rongrong GAO

Abstract:

In intelligent robotics, advanced vision capabilities are essential for perception and interaction in the real world. Although computer vision has made significant strides in recent years, the predominant paradigm still revolves around the analysis of RGB images and yields 2D outputs in the digital realm, such as bounding boxes and masks. This traditional approach, while valuable, shows its limits when applied to the complex, multidimensional challenges of real-world robotics scenarios. RGB information alone cannot sufficiently capture the three-dimensional spatial and contextual nuances essential for effective interaction in diverse environments. Consequently, this discrepancy calls for a paradigm shift in vision capabilities, emphasizing the need for multi-modal representations that enrich robots' understanding of their surroundings. Incorporating additional sensing modalities, such as depth imaging and time-of-flight imaging, becomes instrumental in providing a more holistic and nuanced perception of the physical environment.

To this end, this dissertation explores multi-modal perception within robotic scenarios. The primary focus encompasses three topics:

o Geometric learning of time-of-flight data, including depth and normal estimation;
o Colorization of three-dimensional point cloud data for better scene understanding;
o Multi-modal data compression and decompression for better online storage and sharing.

This dissertation presents a systematic technical survey of each of these three subtopics, covering the current state of technological development and the existing achievements in each sub-field, with an in-depth analysis.
Building on this extensive survey and review, my doctoral dissertation will undertake a series of research studies on these three topics. The research aims to contribute state-of-the-art models, innovative methods, and curated datasets to the related sub-fields, taking a step towards enhanced intelligent perception for robots. In addition, this review is expected to give researchers a clear understanding of the field to guide future research and development.

Date: Thursday, 25 April 2024
Time: 3:00pm - 5:00pm
Venue: Room 5510, Lifts 25/26

Committee Members:
Dr. Qifeng Chen (Supervisor)
Dr. Yangqiu Song (Chairperson)
Dr. Long Chen
Dr. Dan Xu