Geometry Inference from Different Modalities: Videos, Polarization Images, and Portrait Images

PhD Thesis Proposal Defence


Title: "Geometry Inference from Different Modalities: Videos, Polarization 
Images, and Portrait Images"

by

Miss Jiaxin XIE


Abstract:

Learning geometry from a single image is a long-standing and challenging 
problem. Single-image methods rely heavily on learned image priors, which may 
not generalize well to unseen scenes. This thesis explores alternative 
methodologies that incorporate additional information from diverse modalities 
to enhance the understanding of 3D structures.

In Chapter 2, we propose a novel approach that leverages frames extracted from 
monocular videos. Solving the triangulation problem between two video frames 
yields initial depth estimates. This temporal context improves the accuracy 
and robustness of depth estimation, enabling a more complete reconstruction 
of the underlying 3D geometry.
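
As an illustration of the two-frame triangulation idea mentioned above, the 
following is a minimal sketch of standard linear (DLT) triangulation. It is 
not the method proposed in the thesis. The intrinsics K, the projection 
matrices P1 and P2, the test point, and the function names are all 
hypothetical values chosen for the example.

```python
import numpy as np

def triangulate_point(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from two views.

    P1, P2: 3x4 camera projection matrices (assumed known here).
    x1, x2: matching pixel coordinates (u, v) in each frame.
    """
    # Each observation contributes two linear constraints on the
    # homogeneous 3D point X: u * (P[2] @ X) - P[0] @ X = 0, and likewise for v.
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector with the smallest singular value.
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]

def project(P, X):
    """Project a 3D point X into pixel coordinates with projection matrix P."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

# Hypothetical setup: shared intrinsics, second camera shifted 0.1 units along x.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([np.eye(3), np.array([[-0.1], [0.0], [0.0]])])

X_true = np.array([0.2, -0.1, 2.0])
X_est = triangulate_point(P1, P2, project(P1, X_true), project(P2, X_true))
depth = X_est[2]  # depth of the point in the first camera's frame
```

With noise-free correspondences the sketch recovers the point exactly; with 
real video frames the camera poses and matches would themselves have to be 
estimated, and the result would serve only as an initial depth estimate.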

In Chapter 3, we use polarization images to aid surface normal estimation in 
complex scenes. Polarization images capture distinct changes in light 
polarization as it interacts with surfaces of different shapes and materials. 
Analyzing these polarization cues provides dense surface orientation 
information, which facilitates accurate estimation of surface normals.
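
To make "polarization cues" concrete, the sketch below computes two commonly 
used quantities, the degree and angle of linear polarization, from images 
taken behind a linear polarizer at 0, 45, 90, and 135 degrees. This is an 
assumption about the kind of cue involved, not a description of the thesis 
pipeline. The function names and the synthetic test values are hypothetical.

```python
import numpy as np

def polarization_cues(i0, i45, i90, i135, eps=1e-8):
    """Degree (DoLP) and angle (AoLP) of linear polarization per pixel.

    Inputs are intensity images captured with a linear polarizer
    oriented at 0, 45, 90, and 135 degrees.
    """
    # Linear Stokes components from the four polarizer orientations.
    s0 = 0.5 * (i0 + i45 + i90 + i135)
    s1 = i0 - i90
    s2 = i45 - i135
    dolp = np.sqrt(s1**2 + s2**2) / np.maximum(s0, eps)
    aolp = 0.5 * np.arctan2(s2, s1)  # radians, in (-pi/2, pi/2]
    return dolp, aolp

def simulate(theta, s0=1.0, rho=0.4, phi=0.3):
    """Intensity behind a polarizer at angle theta for partially polarized light."""
    return 0.5 * s0 * (1.0 + rho * np.cos(2.0 * (theta - phi)))

# Synthetic 2x2 images with a known DoLP of 0.4 and AoLP of 0.3 rad.
angles = np.deg2rad([0.0, 45.0, 90.0, 135.0])
images = [np.full((2, 2), simulate(t)) for t in angles]
dolp, aolp = polarization_cues(*images)
```

The angle of linear polarization constrains the azimuth of the surface normal 
only up to an ambiguity, which is one reason such cues are typically combined 
with other information rather than used alone.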

In Chapter 4, we leverage a pre-trained 3D-aware portrait image generation 
model to aid depth estimation. The pre-trained model exhibits a strong 
ability to generate multi-view portrait images. Exploiting this 3D-aware 
generation capability, we use the model to infer depth from a single input 
image. The estimated depth is then used to warp pseudo views, effectively 
addressing the challenging geometry-texture trade-off encountered in 3D 
inversion tasks.

Collectively, this thesis advances the learning of 3D geometry from single 
images by incorporating information from different modalities, including 
videos, polarization images, and portrait images. The proposed methodologies 
overcome limitations of naive single-image approaches.


Date:                   Thursday, 12 October 2023

Time:                   10:00am - 12:00noon

Venue:                  Room 2126D
                        Lift 19

Committee Members:      Dr. Qifeng Chen (Supervisor)
                        Dr. Dan Xu (Chairperson)
                        Prof. Pedro Sander
                        Dr. Sai-Kit Yeung


**** ALL are Welcome ****