Spatial Temporal Feature Processing in 3D Point Clouds for Autonomous Driving
PhD Thesis Proposal Defence

Title: "Spatial Temporal Feature Processing in 3D Point Clouds for Autonomous Driving"

by

Mr. Sukai WANG

Abstract:

Over the past decades, 3D point cloud data has become one of the most important data types across a variety of domains, such as robotics, architecture, and especially autonomous driving systems (ADS), because of its robustness and spatial accuracy. Compared with 2D images, point clouds carry accurate location and geometry information, and in real driving scenarios, the short interval between two adjacent point clouds in the data stream introduces abundant temporal redundancy. To better understand point cloud data in the big data era, machine learning-based approaches have gradually come to dominate. Thus, in this proposal, we focus on spatial and temporal feature extraction and processing for point clouds in learning-based methods.

In an ADS, point clouds are widely used in perception, localization, and mapping, and point cloud collection, storage, and transmission are the precursors of these downstream tasks. This thesis explores several feature extraction networks at two different stages, each with a suitable data representation: point cloud compression on range images, and multiple object detection and tracking on spatio-temporal maps.

In end-to-end point cloud compression, we first show that our proposed range image-based compression framework with an entropy model outperforms state-of-the-art octree-based methods. Then, we introduce a hybrid point cloud sequence compression framework, which consists of a static and a dynamic point cloud compression algorithm. In the static compression framework, a geometry-aware attention layer removes spatial redundancy; in the dynamic compression framework, a conv-LSTM with a GHU module removes temporal redundancy. For the downstream task of 3D multiple object detection and tracking, we propose an end-to-end point cloud-based network, DiTNet, which directly assigns a track ID to each object across the whole sequence, without a separate data association step. DiTNet achieves location invariance by using relative locations and embeddings to learn each object's spatial and temporal features in the spatio-temporal world. The features from the detection module help to improve tracking performance, and the tracking module, with its final trajectories, in turn helps to refine the detection results.

Lastly, we conclude and outline the remaining problems for further research: refining 3D object detection given a priori tracking results in the spatio-temporal world, and choosing the best feature extraction module over long frame sequences in all spatio-temporal frameworks.

Date: Friday, 19 August 2022
Time: 10:00am - 12:00noon
Zoom Meeting: https://hkust.zoom.us/j/95872323558?pwd=c003RHNWdjR3WkJJelJJd2VNN01WQT09

Committee Members:
Dr. Ming Liu (Supervisor)
Prof. Cunsheng Ding (Chairperson)
Dr. Qifeng Chen
Prof. Ling Shi (ECE)

**** ALL are Welcome ****
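
For readers unfamiliar with the components named in the abstract, below is a minimal PyTorch sketch of a convolutional LSTM cell of the kind used for temporal redundancy removal over range-image sequences. All shapes and names here are illustrative assumptions, not the thesis implementation, and the GHU module is omitted.

import torch
import torch.nn as nn

class ConvLSTMCell(nn.Module):
    # Single convolutional LSTM cell operating on 2D feature maps
    # (e.g., features of a LiDAR range image).
    def __init__(self, in_ch, hid_ch, kernel_size=3):
        super().__init__()
        # One convolution produces all four gate pre-activations at once.
        self.gates = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch,
                               kernel_size, padding=kernel_size // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = torch.chunk(self.gates(torch.cat([x, h], dim=1)), 4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, c

# Toy usage: a stream of 64 x 1024 single-channel range images.
cell = ConvLSTMCell(in_ch=1, hid_ch=16)
h = torch.zeros(1, 16, 64, 1024)
c = torch.zeros(1, 16, 64, 1024)
for t in range(5):
    frame = torch.rand(1, 1, 64, 1024)  # stand-in for one LiDAR range image
    h, c = cell(frame, (h, c))
# h now summarizes temporal context; a predictor conditioned on it could
# estimate the next frame so that only the residual needs entropy coding.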
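
Similarly, the following sketch illustrates the relative-location idea behind DiTNet's location invariance, under assumed object counts and embedding sizes; it is not the network described in the abstract.

import torch
import torch.nn as nn

# Hypothetical embedding network: maps a 3D offset to a 32-D feature.
embed = nn.Sequential(nn.Linear(3, 32), nn.ReLU(), nn.Linear(32, 32))

centroids = torch.rand(6, 3) * 50.0  # 6 detected objects, xyz in meters
# Pairwise offsets between objects; shape (6, 6, 3). Translating the whole
# scene by a constant cancels out here, which is what makes features built
# from these offsets location-invariant.
rel = centroids[:, None, :] - centroids[None, :, :]
feat = embed(rel).mean(dim=1)  # (6, 32) per-object embedding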