Basic level scene understanding: unifying recognition and reconstruction

===================================================================
                Graphic Group Seminar
===================================================================
Department of Computer Science and Engineering
Center of Visual Computing and Image Science
-------------------------------------------------------------------

Speaker:        Jianxiong Xiao
                Massachusetts Institute of Technology (MIT).

Title:          "Basic level scene understanding: unifying recognition
                 and reconstruction"

Date:           Monday, 26 Nov 2012

Time:           4-5pm

Venue:          Rm 4204 (Graphic Lab), via lifts 19/20
                HKUST

Abstract:

An early goal of computer vision was to build a system that
could automatically understand a scene, not only extracting 3D
information but also inferring the semantics for a large variety of
different environments. In this talk, I will summarize our recent
efforts to unify recognition and reconstruction to reach a more
complete understanding of a scene, by leveraging a huge amount of
data. First, I will describe the SUN database, a collection of
annotated images that exhaustively spans common scene categories. This
database allows us to systematically study the space of everyday
scenes and to establish a benchmark for both scene and object
recognition. I will also talk about ways of coping with the wide
variety of viewpoints within these scenes. We propose the scene
viewpoint recognition task, the goal of which is to recognize the
observer's viewpoint within a place category. For this, we introduce a
database of 360-degree panoramic images and an algorithm that
simultaneously trains a viewpoint classifier and aligns panoramas.
Finally, I will describe steps toward unified 3D scene parsing: (i)
localizing geometric primitives in images, such as cuboids and
cylinders, which comprise many everyday objects, (ii) extracting the
3D structure of the scene and objects depicted in an image, and (iii)
creating a complete place-centric representation of 3D space.

*************
Biography:

Jianxiong Xiao is a Ph.D. candidate working with Antonio Torralba in the
Computer Science and Artificial Intelligence Laboratory (CSAIL) at the
Massachusetts Institute of Technology (MIT). After receiving a B.Eng. from
the Hong Kong University of Science and Technology (HKUST), he received an
M.Phil. in Computer Science while working with Long Quan. His research
interests are in computer vision, with a focus on scene understanding.
Jianxiong won the best student paper award at the European Conference on
Computer Vision (ECCV) and received the Google U.S./Canada Ph.D.
Fellowship in Computer Vision. More information can be found on his
website: http://mit.edu/jxiao.