3D Perception and Motion Prediction with Point Cloud Learning in Autonomous Driving
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering
PhD Thesis Defence
Title: "3D Perception and Motion Prediction with Point Cloud Learning in
Autonomous Driving"
By
Mr. Maosheng YE
Abstract:
A 3D perception system is an essential component of robotics, especially of
autonomous driving systems. 3D segmentation and motion prediction are crucial
subtasks of the perception system, providing fine-grained scene understanding
and forecasting. The point cloud is the primary data structure for 3D
segmentation and 3D object detection in perception. Many point cloud processing
algorithms have been proposed for fine-grained LiDAR segmentation, based on
different representations, each with its own pros and cons. Multi-representation
learning is therefore a common framework for fusing the merits of multiple
representations to balance performance, efficiency, and memory usage. While the
goal is clear, designing a better and more efficient multi-representation
framework remains challenging because the design depends on point cloud
properties in autonomous driving scenarios, including sparsity, irregularity,
and the number of points.
This thesis studies multi-representation point cloud learning in the 3D
perception system in order to design an efficient network structure for
demanding applications. For LiDAR segmentation, we utilize the point and voxel
representations in a unified and efficient manner. Hierarchical learning is
proposed in both the pointwise and voxelwise learning branches. Furthermore, we
propose the voxel-as-point principle to better exploit the sparsity and scale
invariance of the point cloud and to reduce the memory cost incurred by the
point representation. We also design an attentive scale-selection layer, based
on an attention mechanism, that fuses multi-scale information.
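
As a rough illustration of how a voxel representation can stand in for many
raw points, the sketch below groups 3D points by the grid cell they fall into
and keeps one centroid per occupied cell. It is a generic voxelization example,
not the thesis's voxel-as-point formulation; the function name voxelize, the
voxel_size parameter, and the sample points are assumptions made for
illustration only.

```python
from collections import defaultdict
from math import floor


def voxelize(points, voxel_size):
    """Map each occupied voxel index to the centroid of the points inside it.

    Hypothetical helper for illustration; not taken from the thesis.
    """
    buckets = defaultdict(list)
    for x, y, z in points:
        # Integer grid index of the voxel containing this point.
        key = (floor(x / voxel_size), floor(y / voxel_size), floor(z / voxel_size))
        buckets[key].append((x, y, z))

    centroids = {}
    for key, members in buckets.items():
        n = len(members)
        # One representative point per voxel: the mean of its members.
        centroids[key] = tuple(sum(p[i] for p in members) / n for i in range(3))
    return centroids


# Three made-up points: two share a voxel, one sits in its own voxel.
points = [(0.1, 0.2, 0.0), (0.3, 0.4, 0.0), (2.5, 0.1, 0.0)]
voxels = voxelize(points, voxel_size=1.0)
```

Only occupied voxels are stored, so memory grows with the number of non-empty
cells rather than with the full grid, which is one reason sparse voxel
structures suit LiDAR data.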
We also extend these networks to the downstream task of motion prediction,
whose sparse, structured input can be viewed as a special kind of temporal
point cloud; the resulting network is named TPCN. This is the first work to
combine point cloud learning with motion forecasting. To enhance
spatial-temporal robustness under slight disturbances, we propose Dual
Consistency Constraints, which regularize the predicted trajectories under
perturbation during training. We extensively study the efficacy of Dual
Consistency Constraints when applied to other state-of-the-art methods and
demonstrate their effectiveness as a plug-in component.
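
The general idea of regularizing predictions under perturbation can be sketched
as follows: predict once from the clean input and once from a slightly
perturbed copy, then penalize the difference between the two outputs. This is a
generic consistency-regularization sketch, not the Dual Consistency Constraints
as defined in the thesis. The constant-velocity predictor is a toy stand-in for
a learned model, and all names and values (predict, perturb, consistency_loss,
scale, seed) are assumptions.

```python
import random


def predict(history, horizon=3):
    """Toy stand-in predictor: constant-velocity extrapolation (assumption)."""
    (x0, y0), (x1, y1) = history[-2], history[-1]
    vx, vy = x1 - x0, y1 - y0
    return [(x1 + vx * k, y1 + vy * k) for k in range(1, horizon + 1)]


def perturb(history, scale, rng):
    """Add small uniform noise to every observed position."""
    return [
        (x + rng.uniform(-scale, scale), y + rng.uniform(-scale, scale))
        for x, y in history
    ]


def consistency_loss(history, scale=0.05, seed=0):
    """Mean squared distance between clean and perturbed-input predictions."""
    rng = random.Random(seed)
    clean = predict(history)
    noisy = predict(perturb(history, scale, rng))
    total = sum(
        (a - c) ** 2 + (b - d) ** 2 for (a, b), (c, d) in zip(clean, noisy)
    )
    return total / len(clean)


# Made-up observed trajectory of one agent.
history = [(0.0, 0.0), (1.0, 0.5), (2.0, 1.0)]
loss = consistency_loss(history)
```

In training, a term of this kind would typically be added to the main
prediction loss with a weighting factor, so that the model is encouraged to
give similar trajectories for slightly disturbed inputs.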
Date: Thursday, 11 April 2024
Time: 4:30pm - 6:30pm
Venue: Room 5501
Lifts 25/26
Chairman: Prof. Chik Patrick YUE (ECE)
Committee Members: Prof. Qifeng CHEN (Supervisor)
Prof. Dan XU
Prof. Pedro SANDER
Prof. Ling SHI (ECE)
Prof. Dong XU (HKU)