The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence


Title: "Mitigating Data Imperfection for 3D Perception in Autonomous Driving"

By

Mr. Qing LIAN


Abstract

3D perception is a fundamental component of many robotic applications, enabling 
the localization of surrounding obstacles and road elements for downstream 
planning. Neural-network-based approaches currently dominate this area, but 
these methods are often data-hungry. Owing to imperfect camera sensors, 
complicated 3D reasoning, and cumbersome 3D data annotation, the scale and 
quality of available data are often insufficient for training a robust 3D 
perception model. To address these challenges, this thesis proposes a series of 
methods to mitigate data imperfections and enhance the applicability of 3D 
perception algorithms.

To increase the scale of training data, we first design a series of 
geometry-aware data augmentation techniques that generate additional training 
data. The geometry manipulation in these augmentations is guided by an 
investigation of how detectors estimate object depth, so the generated data is 
meaningful for model training. To utilize abundant unlabeled data from other 
sensors, such as stereo cameras and LiDAR, we further design two training 
frameworks that improve monocular 3D detection performance. We first design a 
multi-view semi-supervised training framework that generates additional 
unsupervised training signals through cross-view consistency. When point cloud 
data is additionally available during training, we further utilize it to teach 
the model to reconstruct object shapes. The ill-posed depth estimation problem 
arising from the imperfect sensor is then alleviated by the proposed 
uncertainty-aware cost volume built from the reconstruction error. To improve 
the quality of labeled data, we design a learning-based active learning 
algorithm that automatically finds informative data to improve model 
performance. Lastly, to alleviate distribution shifts across sensors and 
environments, we explore monocular 3D object detection in the wild by designing 
a robust, camera-independent 3D bounding box reasoning paradigm.


Date:			Friday, 13 October 2023

Time:			3:00pm - 5:00pm

Venue:			Room 5510
 			Lifts 25/26

Chairman:		Prof. Eric MARBERG (MATH)

Committee Members:	Prof. Tong ZHANG (Supervisor)
 			Prof. Dan XU
 			Prof. Xiaofang ZHOU
 			Prof. Ping TAN (ECE)
 			Prof. Li ZHANG (Fudan University)


**** ALL are Welcome ****