Urban Scene Segmentation, Recognition and Remodeling
PhD Thesis Proposal Defence

Title: "Urban Scene Segmentation, Recognition and Remodeling"

by

Miss Jinglu WANG

Abstract:

Thriving large-scale 3D reconstruction techniques make city-scale models available, with numerous applications in 3D mapping and reverse engineering. In this thesis, we focus on urban scene segmentation, recognition and remodeling in order to understand the modeled cityscape and, in turn, refine the urban models.

We first generate joint semantic segmentations of urban images and 3D data efficiently, overcoming the barrier of time-consuming manual annotation of training data. With the reconstructed 3D geometry, the training data are initialized using urban priors. We segment the input images and 3D data into semantic categories simultaneously by employing a novel joint object correspondence graph that purifies the automatically generated training samples.

Our second task is to recognize and segment individual building objects. We extract building objects from the orthographic view and then fuse the decomposed roof segmentation onto the 3D models through a structure-aware flooding algorithm. Each building footprint is abstracted by a set of structural primitives that best fit the model geometry and conform to the discovered global regularities. We extend our patchwork assembly to recognize finer structural objects in architecture, such as windows, and then reassemble them based on the recognized grammars. The facade structure is recognized at the object level with a structure-driven Markov chain Monte Carlo sampler. The solution space is explored efficiently because the structure-driven sampler accelerates convergence by exploiting repetitiveness priors.

Semantic information can improve the robustness and accuracy of the initially reconstructed models: it enhances building model regularities and enables plausible rule-based tree model completion. Building regularization is achieved by leveraging a set of structural linear features. We extract reliable linear features from images, triangulate them in space, and infer their spatial relations with a non-linear programming method. The poses of the linear features are adjusted to satisfy the inferred relations in a least-squares manner, followed by a smooth deformation of the entire mesh geometry.

Furthermore, we propose an approach for automatically generating natural-looking, photorealistic 3D models of palm trees, a problem that has seldom been investigated. Given a palm tree image and its segmentation, we extract the 2D tree structure using an edge tracing method and construct the 3D structure by estimating depths with a fast force-based approach. Textures are obtained by minimizing the photometric projection error.

In this thesis, we lift 3D reconstruction from a purely geometric representation to an object-level semantic representation on which high-level applications can be built, and remodel objects with specific semantic meanings so that they are visually pleasing and computationally compact. We demonstrate our methods on several large and challenging datasets.

Date: Thursday, 21 April 2016
Time: 3:00pm - 5:00pm
Venue: Room 5510 (lifts 25/26)

Committee Members: Prof. Long Quan (Supervisor)
                   Dr. Pedro Sander (Chairperson)
                   Prof. Albert Chung
                   Prof. James Kwok

**** ALL are Welcome ****