The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence


Title: "URBAN SCENE SEGMENTATION, RECOGNITION AND REMODELING"

By

Miss Jinglu WANG


Abstract

Thriving large-scale 3D reconstruction techniques have made city-scale models 
available, with numerous applications in 3D mapping and reverse engineering. 
This thesis focuses on urban scene segmentation, recognition and remodeling, 
with the goal of understanding a modeled cityscape and, in turn, refining the 
urban models.

We first generate a joint semantic segmentation of urban images and 3D data, 
surmounting the barrier of time-consuming manual annotation of training data. 
Using the reconstructed 3D geometry, the training data are initialized from 
urban priors. We then segment the input images and 3D data into semantic 
categories simultaneously, employing a novel joint object correspondence graph 
that purifies the automatically generated training samples.

Our second task is to recognize and segment individual building objects. We 
extract building objects from the orthographic view and then fuse the 
decomposed roof segmentation into the 3D models through a structure-aware 
flooding algorithm. Each building footprint is abstracted by a set of 
structural primitives that best fit the model geometry and also conform to the 
discovered global regularities. We extend this patchwork assembly to recognize 
more detailed yet structured objects within the architecture, such as windows, 
and then reassemble them based on recognized grammars. A facade structure is 
recognized at the object level with a structure-driven Markov chain Monte 
Carlo sampler. The solution space is explored efficiently because the 
structure-driven sampler exploits repetitiveness priors to accelerate 
convergence.
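To give a flavor of how a structure-driven sampler can exploit repetitiveness, here is a toy Metropolis-Hastings sketch in 1D. The names `facade_score` and `structure_driven_mcmc` are hypothetical, and the score and proposal moves are illustrative assumptions, not the thesis's actual sampler: half of the proposals jitter the shared spacing of an entire window row, so all hypotheses move coherently under the repetition prior.

```python
import math
import random

def facade_score(positions, observed, sigma=1.0):
    # Hypothetical likelihood: each noisy window detection is explained
    # by its nearest hypothesized window position.
    err = sum(min((p - o) ** 2 for p in positions) for o in observed)
    return math.exp(-err / (2 * sigma ** 2))

def structure_driven_mcmc(observed, n_windows, iters=2000, seed=0):
    rng = random.Random(seed)
    lo, hi = min(observed), max(observed)
    step = (hi - lo) / max(n_windows - 1, 1)
    # Initialize with a uniformly spaced row (the repetitiveness prior).
    state = [lo + i * step for i in range(n_windows)]
    score = facade_score(state, observed)
    best, best_score = state, score
    for _ in range(iters):
        if rng.random() < 0.5:
            # Structure-driven move: perturb the common spacing of the whole
            # row, so every window shifts coherently.
            spacing = (state[-1] - state[0]) / max(n_windows - 1, 1)
            spacing += rng.gauss(0.0, 0.05)
            proposal = [state[0] + i * spacing for i in range(n_windows)]
        else:
            # Local move: perturb a single window position.
            proposal = list(state)
            proposal[rng.randrange(n_windows)] += rng.gauss(0.0, 0.1)
        p_score = facade_score(proposal, observed)
        # Metropolis acceptance (symmetric proposals assumed).
        if p_score >= score or rng.random() < p_score / score:
            state, score = proposal, p_score
            if score > best_score:
                best, best_score = state, score
    return sorted(best)
```

Because the structured move updates all windows at once, a single accepted proposal can correct the whole layout, which is the intuition behind the faster convergence claimed above.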

Semantic information helps improve the robustness and accuracy of the 
initially reconstructed models by enforcing building model regularities and 
providing plausible tree model completion. Building regularization is achieved 
by leveraging a set of structural linear features. We detect reliable linear 
features in images, triangulate them in space, and infer their spatial 
relations with a non-linear programming method. The poses of the linear 
features are then adjusted to satisfy the inferred relations in a 
least-squares manner, followed by a smooth deformation of the entire mesh 
geometry. Furthermore, we propose an approach for automatically generating 
natural-looking, photorealistic 3D models of palm trees, a problem seldom 
investigated in previous work. Given a palm tree image and its segmentation, 
we extract the 2D tree structure with an edge-tracing method and construct the 
3D structure by estimating depths through a fast force-based approach. 
Textures are obtained by minimizing the photometric projection error.
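The least-squares adjustment step can be illustrated with a minimal sketch, under the simplifying assumptions that each line feature is reduced to a single orientation angle and that only parallelism relations are inferred (the thesis adjusts full 3D line poses; the helper names `solve` and `adjust_angles` are hypothetical). The quadratic objective leads to linear normal equations:

```python
def solve(A, b):
    # Plain Gauss-Jordan elimination with partial pivoting for the small
    # normal-equation system.
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c and M[r][c]:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def adjust_angles(observed, parallel_pairs, w=10.0):
    # Minimize sum_i (t_i - obs_i)^2 + w * sum_(i,j) (t_i - t_j)^2
    # over the inferred parallel pairs. The normal equations are
    # (I + w L) t = obs, with L the Laplacian of the relation graph.
    n = len(observed)
    A = [[float(i == j) for j in range(n)] for i in range(n)]
    for i, j in parallel_pairs:
        A[i][i] += w
        A[j][j] += w
        A[i][j] -= w
        A[j][i] -= w
    return solve(A, list(observed))
```

For example, with observed angles `[0.00, 0.05, 1.55]` and the first two lines inferred parallel, the adjustment pulls the first two angles together around their mean while leaving the unrelated third angle untouched.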

In this thesis, we elevate 3D reconstruction from a purely geometry-based 
representation to an object-level semantic representation on which high-level 
applications can be built, and remodel objects with specific semantic meanings 
so that they are visually pleasing and computationally compact. We demonstrate 
our methods on several large and challenging datasets.


Date:			Thursday, 18 August 2016

Time:			3:00pm – 5:00pm

Venue:			Room 5564
 			Lifts 27/28

Chairman:		Prof. Howard Luong (ECE)

Committee Members:	Prof. Long Quan (Supervisor)
 			Prof. Huamin Qu
 			Prof. Chiew-Lan Tai
 			Prof. Ajay Joneja (IELM)
 			Prof. Jiaya Jia (Comp. Sci. & Engg., CUHK)


**** ALL are Welcome ****