Mitigating Data Imperfection for 3D Perception in Autonomous Driving
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

PhD Thesis Defence

Title: "Mitigating Data Imperfection for 3D Perception in Autonomous Driving"

By

Mr. Qing LIAN

Abstract

3D perception is a fundamental component in many robotic applications, enabling the localization of surrounding obstacles and road elements for downstream planning. Neural-network-based approaches currently dominate this area, but these methods are often data-hungry. Because of imperfect camera sensors, complicated 3D reasoning, and cumbersome 3D data annotation, the scale and quality of the available data are often insufficient for training a robust 3D perception model. To address these challenges, this thesis proposes a series of methods to mitigate data imperfections and enhance the applicability of 3D perception algorithms.

To increase the scale of training data, we first design a series of geometry-aware data augmentation techniques to generate additional training data. The design of geometry manipulation in augmentation is guided by an investigation of how detectors estimate object depth, so the generated data is meaningful for model training. To utilize abundant unlabeled data from other sensors, such as stereo cameras and LiDAR, we further design two training frameworks to improve monocular 3D detection performance. We first design a multi-view semi-supervised training framework that generates additional unsupervised training signals through cross-view consistency. When additional point cloud data is available during training, we further use it to teach the model to reconstruct object shapes. The ill-posed depth estimation problem caused by the imperfect sensor is then alleviated by the proposed uncertainty-aware cost volume, which is built from the reconstruction error. To improve the quality of labeled data, we design a learning-based active learning algorithm that can automatically find the informative data to improve model performance.
Lastly, to alleviate the distribution shifts across sensors and environments, we explore monocular 3D object detection in the wild by designing a robust and camera-irrelevant 3D bounding box reasoning paradigm.

Date: Friday, 13 October 2023
Time: 3:00pm - 5:00pm
Venue: Room 5510
       Lifts 25/26

Chairman: Prof. Eric MARBERG (MATH)

Committee Members: Prof. Tong ZHANG (Supervisor)
                   Prof. Dan XU
                   Prof. Xiaofang ZHOU
                   Prof. Ping TAN (ECE)
                   Prof. Li ZHANG (Fudan University)

**** ALL are Welcome ****