3D Semantic Segmentation of Indoor and Outdoor Scenes

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence


Title: "3D Semantic Segmentation of Indoor and Outdoor Scenes"

By

Mr. Zeyu HU


Abstract

3D semantic segmentation is an indispensable cornerstone for thorough 3D 
scene understanding, and it faces different challenges in indoor and 
outdoor scenes due to their respective characteristics. In indoor scenes, 
objects are densely placed and have various structures, which brings two 
major challenges for 3D semantic segmentation: 1. how to generate accurate 
and clear segmentation boundaries; 2. how to extract surface information 
from complex and irregular geometries. Compared with indoor scenes, 
outdoor scenes have much larger scanning ranges covering millions of 
points, posing a fundamental question for 3D semantic segmentation: how to 
effectively label outdoor scene datasets. This thesis presents three 
methods that aim to address the above challenges.

To address the first challenge in indoor scenes, we introduce the task of 
semantic edge detection to the 3D field. It serves as the dual task of 3D 
semantic segmentation and focuses on the segmentation boundaries. We adopt 
the idea of complementary learning and present JSENet, a novel joint 
learning framework that brings significant improvements to the 
segmentation boundaries of indoor scenes by explicitly exploiting the 
duality between the two tasks. Further, to address the second challenge of 
extracting surface information from complex and irregular geometries of 
objects in indoor scenes, we adopt the often-overlooked mesh 
representation in which valuable geodesic information of geometric 
surfaces is naturally embedded. We propose VMNet, a novel deep 
architecture that operates on voxel and mesh representations 
simultaneously. By leveraging both the Euclidean information embedded in 
voxels and the geodesic information embedded in meshes, for indoor scenes, 
we develop a geodesic-aware 3D semantic segmentation method that generates 
accurate segmentation results on complex geometries. Finally, to address 
the third challenge in outdoor scenes, we study the task of 
label-efficient 3D semantic segmentation. Outdoor scenes are generally 
captured as continuous LiDAR frame sequences containing a large number of 
points that are expensive to label and carry redundant information. We propose 
to utilize inter-frame correlation to tackle the information redundancy 
problem in these LiDAR frames. By estimating model uncertainty based on 
the inconsistency of predictions across these continuous frames, we design 
LiDAL, a novel active learning strategy for 3D LiDAR semantic segmentation 
of outdoor scenes, which significantly reduces label annotation costs. We 
conduct extensive experiments to demonstrate the effectiveness of our 
methods.


Date:			Wednesday, 7 December 2022

Time:			2:00pm - 4:00pm

Venue:			Room 3494
 			lifts 25/26

Chairperson:		Prof. Richard LAKERVELD (CBE)

Committee Members:	Prof. Chiew Lan TAI (Supervisor)
 			Prof. Qifeng CHEN
 			Prof. Long QUAN
 			Prof. Weichuan YU (ECE)
 			Prof. Chi Wing FU (CUHK)


**** ALL are Welcome ****