PhD Thesis Proposal Defence
Title: "Urban Scene Segmentation, Recognition and Remodeling"
by
Miss Jinglu WANG
Abstract:
Thriving large-scale 3D reconstruction techniques have made city-scale models
available, and such models find numerous applications in 3D mapping and
reverse engineering. In this thesis, we focus on urban scene segmentation,
recognition, and remodeling in order to understand the modeled cityscape and,
in turn, refine the urban models.
We first generate a joint semantic segmentation of urban images and 3D data
efficiently, surmounting the barrier of time-consuming manual annotation of
training data. Using the reconstructed 3D geometry, the training data are
initialized from urban priors. We then segment the input images and 3D data
into semantic categories simultaneously by employing a novel joint object
correspondence graph that purifies the automatically generated training
samples.
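
For illustration only, the sketch below shows, under assumed data structures,
how noisy automatically generated labels could be purified by voting over
connected components of a correspondence graph; the function names, the
majority-vote rule, and the use of networkx are placeholders rather than the
exact formulation developed in the thesis.

    # Illustrative sketch: purifying noisy training labels over a
    # correspondence graph (assumed data structures, not the thesis algorithm).
    import networkx as nx
    from collections import Counter

    def purify_labels(samples, correspondences):
        """samples: {sample_id: noisy_label}; correspondences: iterable of
        (sample_id, sample_id) pairs linking image segments and 3D regions
        believed to show the same object."""
        g = nx.Graph()
        g.add_nodes_from(samples)
        g.add_edges_from(correspondences)
        purified = {}
        for component in nx.connected_components(g):
            # Vote among corresponding samples; keep only those agreeing
            # with the majority label, discarding likely annotation noise.
            majority, count = Counter(samples[s] for s in component).most_common(1)[0]
            if count / len(component) >= 0.5:
                purified.update({s: majority for s in component
                                 if samples[s] == majority})
        return purified
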
Our second task is to recognize and segment individual building objects. We
extract building objects from the orthographic view and then fuse the
decomposed roof segmentation onto the 3D models through a structure-aware
flooding algorithm. Each building footprint is abstracted by a set of
structural primitives that best fit the model geometry and also conform to the
discovered global regularities. We extend our patchwork assembly to recognize
more detailed structural objects in architecture, such as windows, and then
reassemble them based on the recognized grammars. The facade structure is
recognized at the object level with a structure-driven Markov chain Monte
Carlo (MCMC) sampler. The solution space is explored efficiently because the
structure-driven sampler accelerates convergence by exploiting repetitiveness
priors.
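
As a loose illustration of how a structure-driven sampler can accelerate
convergence by exploiting repetition, the sketch below runs a generic
Metropolis-Hastings loop in which one proposal copies an element along a
dominant repetition spacing; the energy, the proposal functions, and their
parameters are assumed placeholders, not the sampler developed in the thesis.

    # Illustrative Metropolis-Hastings sketch with a repetition-aware
    # proposal (assumed energy and moves, not the thesis sampler).
    import math, random, copy

    def sample_facade(initial_layout, energy, propose_local, propose_repeat,
                      iterations=10000, temperature=1.0, p_repeat=0.5):
        """energy(layout) -> float to be minimized; propose_local perturbs one
        element; propose_repeat copies an element by the repetition spacing,
        which is what speeds up exploration of regular facades."""
        current = copy.deepcopy(initial_layout)
        current_e = energy(current)
        best, best_e = current, current_e
        for _ in range(iterations):
            move = propose_repeat if random.random() < p_repeat else propose_local
            candidate = move(copy.deepcopy(current))
            candidate_e = energy(candidate)
            # Always accept improvements; accept worse layouts with
            # Boltzmann probability so the sampler can escape local minima.
            if candidate_e < current_e or \
               random.random() < math.exp((current_e - candidate_e) / temperature):
                current, current_e = candidate, candidate_e
                if current_e < best_e:
                    best, best_e = current, current_e
        return best
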
Semantic information can improve the robustness and accuracy of the initially
reconstructed models: it can enforce building model regularities and provide
plausible, rule-based tree model completion. The building regularization is
achieved by leveraging a set of structural linear features. We extract
reliable linear features from images, triangulate them in space, and infer
their spatial relations with a non-linear programming method. The poses of the
linear features are adjusted to satisfy the inferred relations in a
least-squares manner, followed by a smooth deformation of the entire mesh
geometry. Furthermore, we propose an approach for automatically generating
natural-looking and photorealistic 3D models of palm trees, a problem that has
seldom been investigated. Given a palm tree image and its segmentation, we
extract the 2D tree structure with an edge tracing method and construct the 3D
structure by estimating depths through a fast force-based approach. Textures
are obtained by minimizing the photometric projection error.
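
As a rough illustration of adjusting linear features in a least-squares manner
to satisfy inferred relations, the sketch below refines 3D line directions so
that pairs inferred to be parallel agree while staying close to the
triangulated observations; the residual terms, the weight, and the use of
scipy.optimize.least_squares are assumptions for illustration, not the thesis
formulation, which also handles other relations and the final mesh deformation.

    # Illustrative sketch: least-squares adjustment of 3D line directions so
    # that inferred parallel pairs agree (assumed residuals only).
    import numpy as np
    from scipy.optimize import least_squares

    def adjust_directions(directions, parallel_pairs, weight=10.0):
        """directions: (n, 3) unit direction vectors of triangulated lines;
        parallel_pairs: list of (i, j) index pairs inferred to be parallel."""
        d0 = np.asarray(directions, dtype=float)
        n = len(d0)

        def residuals(x):
            d = x.reshape(n, 3)
            res = [(d - d0).ravel(),                 # stay near observations
                   np.linalg.norm(d, axis=1) - 1.0]  # keep unit length
            for i, j in parallel_pairs:              # penalize non-parallelism
                res.append(weight * np.cross(d[i], d[j]))
            return np.concatenate(res)

        sol = least_squares(residuals, d0.ravel())
        d = sol.x.reshape(n, 3)
        return d / np.linalg.norm(d, axis=1, keepdims=True)
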
In this thesis, we transfer the 3D reconstruction from a purely geometry-based
representation to an object-level semantic representation on which high-level
applications can be built, and we remodel objects with specific semantic
meanings so that they are visually pleasing and computationally compact. We
demonstrate our methods on a few large and challenging datasets.
Date: Thursday, 21 April 2016
Time: 3:00pm - 5:00pm
Venue: Room 5510
(lifts 25/26)
Committee Members: Prof. Long Quan (Supervisor)
Dr. Pedro Sander (Chairperson)
Prof. Albert Chung
Prof. James Kwok
**** ALL are Welcome ****