Urban Scene Parsing with Images and Scan Data
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

PhD Thesis Defence

Title: "Urban Scene Parsing with Images and Scan Data"

By

Mr. Honghui Zhang

Abstract

Urban scene parsing, the task of segmenting objects of interest and identifying their categories in urban scenes, is a fundamental problem in urban scene understanding. As a representative constrained scene parsing task, it is closely related to many important applications that have recently received great attention, such as 3D city modeling and autonomous vehicle navigation. In this thesis, we investigate methods for urban scene parsing with images and scan data, as well as parameter learning for random field models, which are widely used to formulate various scene parsing tasks.

With both images and scan data, we propose a novel joint image and scan data scene parsing system that can be applied to large-scale urban scenes. The proposed system automatically obtains the necessary training data from the input data, whereas previous work usually obtained it through manual labeling. An associative Hierarchical CRF, trained with the automatically obtained training data, is then adopted to jointly segment the images and the scan point cloud by integrating 3D geometry information with 2D image appearance information.

For urban image parsing, we propose a nonparametric scene parsing method that exploits the partial similarity between images, and a parametric scene parsing method, the supervised label transfer method. The partial-similarity-based nonparametric method involves no training process and reduces the inference problem in scene parsing to a matching problem. By contrast, the supervised label transfer method transforms the inference problem in scene parsing into a supervised matching problem, inheriting the advantages of nonparametric scene parsing methods.
These methods are evaluated and compared with state-of-the-art methods on several public datasets and on real Google Street View data, with encouraging results.

Finally, we propose an adaptive discriminative learning algorithm that learns, from given training data, the parameters of the random field models widely used to formulate scene parsing tasks. The parameters are updated iteratively with an adaptive step size by solving a structured prediction problem; the update takes a form similar to that of the Projected Subgradient method. The proposed algorithm achieves performance comparable to that of the classical StructSVM and Projected Subgradient methods, with significantly improved learning efficiency.

Date: Friday, 3 August 2012

Time: 3:00pm – 5:00pm

Venue: Room 3501
Lifts 25/26

Chairman: Prof. Li Qiu (ECE)

Committee Members: Prof. Long Quan (Supervisor)
Prof. Helen Shen
Prof. Dit-Yan Yeung
Prof. Kai Tang (MECH)
Prof. Edmond Boyer (INRIA, France)

**** ALL are Welcome ****