Robust Registration and Semantic Understanding of 3D Point Clouds

PhD Thesis Proposal Defence


Title: "Robust Registration and Semantic Understanding of 3D Point
Clouds"

by

Mr. Xuyang BAI


Abstract:

In recent decades, with the prevalence of affordable RGB-D cameras and 
LiDAR (Light Detection and Ranging) scanners, point cloud representation 
has become increasingly practical and popular in many computer vision 
applications such as structure from motion (SfM) and simultaneous 
localization and mapping (SLAM). Since a single acquisition is usually 
limited to one viewpoint, several acquisitions are required to cover the 
whole area of interest. To align these acquisitions from different 
viewpoints, point cloud registration is applied as a fundamental step that 
finds an optimal transformation between partially overlapping point cloud 
fragments and recovers the complete underlying geometry. Subsequently, 
with the reconstructed 3D model represented as 
point clouds, performing semantic scene understanding is necessary for 
many applications such as autonomous driving and augmented reality. In 
this thesis, we present our efforts in studying and contributing to these 
two problems, namely, point cloud registration and point cloud semantic 
understanding.

First, we present methods for efficient and robust point cloud 
registration. Specifically, we decompose the point cloud registration 
pipeline into three learnable sub-modules. In the first method, we design 
1) a keypoint detector and 2) a keypoint descriptor for efficient local 
feature extraction, where we demonstrate the superiority of joint learning 
of both detection and description tasks. In the second method, we 
propose 3) a correspondence filtering sub-module to improve robustness 
to large outlier ratios. We explicitly incorporate the spatial 
consistency imposed by rigid transformations to prune outlier 
correspondences.
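
The spatial consistency idea can be illustrated with a minimal, 
non-learned sketch. It is not the method proposed in the thesis, which is 
learning-based; it only shows the underlying geometric fact that a rigid 
transformation preserves distances, so two correct correspondences must 
span the same length in both point clouds. The function names, the 
tolerance parameter tol, and the seed-based pruning rule are assumptions 
made for this illustration.

```python
import math


def compatibility_matrix(src, dst, tol):
    """compat[i][j] is True when correspondences i and j preserve length.

    src[i] is putatively matched to dst[i]; points are (x, y, z) tuples.
    """
    n = len(src)
    compat = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # A rigid transformation preserves distances, so two correct
            # matches must span the same length in both point clouds.
            gap = abs(math.dist(src[i], src[j]) - math.dist(dst[i], dst[j]))
            compat[i][j] = compat[j][i] = gap < tol
    return compat


def prune_by_spatial_consistency(src, dst, tol=0.05):
    """Return one keep-flag per correspondence (hypothetical pruning rule)."""
    if not src:
        return []
    compat = compatibility_matrix(src, dst, tol)
    # Support = number of other correspondences that agree with this one.
    support = [sum(row) for row in compat]
    # Seed with the best-supported correspondence (first one on ties).
    seed = max(range(len(src)), key=support.__getitem__)
    # Keep the seed and every correspondence compatible with it.
    keep = list(compat[seed])
    keep[seed] = True
    return keep
```

Calling prune_by_spatial_consistency(src, dst) on two equally long lists 
of matched 3D points returns a list of booleans marking the 
correspondences to keep. An outlier can still pass this pairwise check by 
chance, so a sketch like this would normally be followed by a robust 
transformation estimate on the retained matches.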

Second, we study the problem of semantic understanding of the registered 
point clouds. Specifically, we propose a LiDAR-camera fusion solution for 
3D perception. Our studies investigate the inherent difficulties of 
LiDAR-camera fusion and reveal a crucial aspect of robust fusion, namely, 
the soft-association mechanism. The proposed module is integrated into 
object detection and multiple object tracking frameworks and can be easily 
extended to other tasks such as LiDAR semantic segmentation.

In summary, we have developed several learning-based methods for robust 
point cloud registration as well as semantic parsing of the registered 
point clouds. The proposed methods have been extensively evaluated on 
standardized benchmarks, where superior performance and strong 
generalization ability have been demonstrated.


Date:			Thursday, 7 April 2022

Time:                  	2:00pm - 4:00pm

Zoom Meeting:		https://hkust.zoom.us/j/3966929732

Committee Members:	Prof. Chiew-Lan Tai (Supervisor)
  			Prof. Chi-Keung Tang (Chairperson)
 			Dr. Qifeng Chen
 			Prof. Pedro Sander


**** ALL are Welcome ****