Urban Scene Segmentation, Recognition and Remodeling

PhD Thesis Proposal Defence


Title: "Urban Scene Segmentation, Recognition and Remodeling"

by

Miss Jinglu WANG


Abstract:

Thriving large-scale 3D reconstruction techniques have made city-scale models 
available, with numerous applications in 3D mapping and reverse engineering. 
In this thesis, we focus on urban scene segmentation, recognition, and 
remodeling, with the twin goals of understanding the modeled cityscape and, in 
turn, refining the urban models.

We first generate joint semantic segmentations of urban images and 3D data 
efficiently, surmounting the barrier of time-consuming manual annotation of 
training data. Using the reconstructed 3D geometry, training data are 
initialized from urban priors. We then segment the input images and 3D data 
into semantic categories simultaneously by employing a novel joint object 
correspondence graph that purifies the automatically generated training 
samples.
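The purification step can be pictured as a consistency filter over 2D-3D correspondences. The following is a minimal illustrative sketch only; the identifiers and the simple majority-vote rule are assumptions for exposition, not the thesis's actual joint-graph formulation:

```python
from collections import defaultdict

def purify_samples(samples, links):
    """Filter noisy training samples using 2D-3D correspondences.

    samples: dict mapping region id -> tentative semantic label, covering
             both image superpixels and 3D segments.
    links:   list of (superpixel_id, segment_id) correspondence edges.
    A sample is kept only if its label agrees with a majority of the
    labels seen across its correspondence links (a stand-in for the
    joint inference on the object correspondence graph).
    """
    votes = defaultdict(list)
    for sp, seg in links:
        votes[sp].append(samples[seg])   # 3D label votes for the superpixel
        votes[seg].append(samples[sp])   # 2D label votes for the segment
    purified = {}
    for rid, label in samples.items():
        agree = votes[rid].count(label)
        if not votes[rid] or agree * 2 >= len(votes[rid]):
            purified[rid] = label        # consistent with most correspondences
    return purified
```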

Our second task is to recognize and segment individual building objects. We 
extract building objects from the orthographic view and then fuse the 
decomposed roof segmentation onto the 3D models through a structure-aware 
flooding algorithm. Each building footprint is abstracted by a set of 
structural primitives that best fit the model geometry while conforming to the 
discovered global regularities. We extend this patchwork assembly to recognize 
finer structural objects in architecture, such as windows, and then reassemble 
them according to the recognized grammars. The facade structure is recognized 
at the object level with a structure-driven Markov chain Monte Carlo (MCMC) 
sampler, which explores the solution space efficiently: the structure-driven 
proposals exploit repetitiveness priors to accelerate convergence.
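A structure-driven sampler of this flavor can be sketched in miniature as a Metropolis sampler that fits a repetition period to detected window positions; the scoring model, proposal mix, and all parameters below are illustrative assumptions, not the thesis's facade grammar:

```python
import math
import random

def score(period, detections, sigma=0.5):
    """Log-likelihood of a repetition period given detected window
    x-positions: rewards periods under which each detection falls near
    an integer multiple of the period."""
    ll = 0.0
    for x in detections:
        frac = x / period - round(x / period)   # offset from the grid
        ll += -(frac * period) ** 2 / (2 * sigma ** 2)
    return ll

def mcmc_period(detections, steps=2000, seed=0):
    """Metropolis sampler for the period. The 'structure-driven' move
    occasionally jumps to half/double the current period, mimicking how
    repetitiveness priors let the sampler leap between plausible grids
    instead of diffusing slowly."""
    rng = random.Random(seed)
    p = 2.0
    ll = score(p, detections)
    best, best_ll = p, ll
    for _ in range(steps):
        if rng.random() < 0.2:                       # structure-driven jump
            q = p * rng.choice([0.5, 2.0])
        else:                                        # local random walk
            q = max(0.5, p + rng.gauss(0.0, 0.1))
        qll = score(q, detections)
        if math.log(rng.random() + 1e-12) < qll - ll:  # Metropolis accept
            p, ll = q, qll
            if ll > best_ll:
                best, best_ll = p, ll
    return best
```

By construction the sampler never returns a period scoring worse than its starting guess, and the period-halving/doubling jumps are what stand in for the structure-driven proposals.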

Semantic information improves the robustness and accuracy of the initially 
reconstructed models: it enhances building model regularities and enables 
plausible, rule-based tree model completion. Building regularization is 
achieved by leveraging a set of structural linear features. We extract 
reliable linear features from images, triangulate them in space, and infer 
their spatial relations with a nonlinear programming method. The poses of the 
linear features are then adjusted in a least-squares manner to satisfy the 
inferred relations, followed by a smooth deformation of the entire mesh 
geometry. Furthermore, we propose an approach, seldom investigated before, for 
automatically generating natural-looking and photorealistic 3D models of palm 
trees. Given a palm tree image and its segmentation, we extract the 2D tree 
structure with an edge-tracing method and recover the 3D structure by 
estimating depths through a fast force-based approach. Textures are obtained 
by minimizing the photometric projection error.
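As a toy illustration of the least-squares adjustment stage, consider enforcing just one relation type, parallelism, on 3D line directions. The thesis infers richer relations (and adjusts full poses, not only directions); the closed-form principal-axis rule below is a simplifying assumption:

```python
import numpy as np

def adjust_directions(dirs, parallel_groups):
    """Least-squares adjustment of line directions to satisfy inferred
    parallelism. Within each group, the common direction closest to all
    members (sign-invariantly, in the least-squares sense) is the
    dominant eigenvector of the sum of outer products of the members'
    unit directions."""
    dirs = np.asarray(dirs, dtype=float)
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    adjusted = dirs.copy()
    for group in parallel_groups:
        d = dirs[group]
        scatter = d.T @ d                 # 3x3 scatter matrix of the group
        _, vecs = np.linalg.eigh(scatter)
        common = vecs[:, -1]              # eigenvector of largest eigenvalue
        for i in group:
            # keep each line's original orientation (sign)
            adjusted[i] = common if dirs[i] @ common >= 0 else -common
    return adjusted
```

Lines outside any group are left untouched, matching the idea that only the inferred relations constrain the adjustment while the subsequent mesh deformation propagates the change smoothly.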

In this thesis, we lift 3D reconstruction from a purely geometry-based 
representation to an object-level semantic representation on which high-level 
applications can be built, and we remodel objects with specific semantic 
meanings so that they are visually pleasing and computationally compact. We 
demonstrate our methods on several large and challenging datasets.


Date:			Thursday, 21 April 2016

Time:			3:00pm - 5:00pm

Venue:			Room 5510 (lifts 25/26)

Committee Members:	Prof. Long Quan (Supervisor)
			Dr. Pedro Sander (Chairperson)
			Prof. Albert Chung
			Prof. James Kwok


**** ALL are Welcome ****