Urban Scene Parsing with Images and Scan Data
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering
PhD Thesis Defence
Title: "Urban Scene Parsing with Images and Scan Data"
By
Mr. Honghui Zhang
Abstract
Urban scene parsing, which segments objects of interest and identifies
their categories in urban scenes, is a fundamental problem in urban scene
understanding. As a representative constrained scene parsing task, it is
closely related to many important applications that have recently
received great attention, such as 3D city modeling and autonomous vehicle
navigation. In this thesis, we investigate methods for urban scene
parsing with images and scan data, as well as parameter learning for the
random field models that are widely used to formulate various scene
parsing tasks.
Using both images and scan data, we propose a novel joint scene parsing
system that can be applied to large-scale urban scenes. The proposed
system automatically obtains the necessary training data from the input
data, whereas previous work usually obtains it through manual labeling.
Then, an associative Hierarchical CRF trained with the automatically
obtained training data is adopted to jointly segment the images and the
scan point cloud by integrating 3D geometry information and 2D image
appearance information. For urban image parsing, we propose a
nonparametric scene parsing method that exploits the partial similarity
between images, and a parametric scene parsing method, the supervised
label transfer method. The partial-similarity-based nonparametric method
involves no training process and reduces the inference problem in scene
parsing to a matching problem. By contrast, the supervised label transfer
method transforms the inference problem in scene parsing into a
supervised matching problem, inheriting the advantages of nonparametric
scene parsing methods. The proposed methods are evaluated and compared
with several state-of-the-art methods on several public datasets and on
real Google Street View data, with encouraging results.
Finally, we propose an adaptive discriminative learning algorithm to
learn, from given training data, the parameters of the random field
models that are widely used to formulate various scene parsing tasks. The
parameters are iteratively updated with an adaptive step size by solving
a structured prediction problem, with an update form similar to that of
the Projected Subgradient method. The proposed algorithm achieves
performance comparable to that of the classical StructSVM method and the
Projected Subgradient method, with significantly improved learning
efficiency.
Date: Friday, 3 August 2012
Time: 3:00pm – 5:00pm
Venue: Room 3501
Lifts 25/26
Chairman: Prof. Li Qiu (ECE)
Committee Members: Prof. Long Quan (Supervisor)
Prof. Helen Shen
Prof. Dit-Yan Yeung
Prof. Kai Tang (MECH)
Prof. Edmond Boyer (INRIA, France)
**** ALL are Welcome ****