Spatial-temporal Feature Processing in 3D Point Clouds for Autonomous Driving
PhD Thesis Defence

Title: "Spatial-temporal Feature Processing in 3D Point Clouds for Autonomous Driving"

By

Mr. Sukai WANG

Abstract

In past decades, 3D point cloud data has become one of the most important data types across a variety of domains, such as robotics, architecture, and especially autonomous driving assistance systems (ADAS), because of its robustness and spatial accuracy. To better understand point cloud data in the big data era, machine learning-based approaches have gradually come to dominate. Thus, in this thesis, I focus on spatial and temporal feature extraction and processing for point cloud sequences in learning-based methods. In autonomous driving systems (ADS), point clouds are widely used in perception, localization, and mapping, and point cloud collection, storage, and transmission are the precursors of those downstream tasks.

This thesis explores several feature extraction networks in two different directions: point cloud compression, and multiple object detection and tracking. The perception task can be seen as one of the downstream tasks of data compression. In end-to-end point cloud compression, I first propose a baseline range image-based method to demonstrate that the range image-based compression framework is better than octree-based methods for scanning LiDARs in autonomous driving. Then, motivated by video compression, I introduce a hybrid point cloud sequence compression framework, which consists of a static and a dynamic learning-based point cloud compression algorithm. In the static compression framework, a geometry-aware attention layer helps remove spatial redundancy. In the dynamic compression framework, a conv-LSTM with a GHU module is used for temporal redundancy removal.

In the downstream task, 3D multiple object detection and tracking, I first propose a "fake" end-to-end tracking-with-detection framework that predicts the objects' movement to improve data association accuracy. Then I introduce a "real" end-to-end MOT network, ST-TrackNet, which rearranges the object detections in a spatio-temporal map and then directly predicts the object track ID without the data association step. Based on the above research, I propose DiTNet, which integrates a detection module with the tracking network. The features from the detection module help to improve the tracking performance, and the tracking module, with its final trajectories, in turn helps to refine the detection results. Lastly, I summarize this thesis and propose future research opportunities.

Date: Monday, 5 December 2022
Time: 10:00am - 12:00 noon
Venue: Room 4472 (Lifts 25/26)

Chairperson: Prof. Maosheng XIONG (MATH)

Committee Members:
Prof. Ming LIU (Supervisor)
Prof. Qifeng CHEN
Prof. Cunsheng DING
Prof. Ling SHI (ECE)
Prof. Hesheng WANG (Shanghai Jiao Tong University)

**** ALL are Welcome ****