The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence


Title: "Towards Practical Volumetric Scene Representation for Novel View 
Synthesis"

By

Mr. Li MA


Abstract:

Given sparse observations of a 3D scene, novel view synthesis (NVS) aims to
reconstruct arbitrary views of the scene. It has many practical applications,
such as virtual reality, online conferencing, and 3D content creation. A
common paradigm for NVS is to reconstruct a 3D representation, which can then
be rendered to generate novel views. Classical approaches rely on
off-the-shelf structure-from-motion (SfM) algorithms to reconstruct a mesh or
point cloud, which tend to fail in challenging cases such as complex geometry
and strong view-dependent effects. Recently, volumetric representations, such
as Multiplane Images (MPI) and Neural Radiance Fields (NeRF), have
revolutionized NVS by generating images that are nearly indistinguishable
from real photographs. However, several open problems remain for volume-based
methods. For instance, the reconstruction process is prone to errors under
non-ideal capturing conditions, such as blurry inputs. 3D volumes are also
difficult to edit, hindering post-production modifications. Moreover, they
are inefficient to render, especially for dynamic scenes, which precludes
interactive applications.

This dissertation addresses several limitations of volumetric representations
and proposes improvements that benefit several practical applications. We
first propose a method for restoring a sharp NeRF from blurry observations.
We show that the straightforward reconstruction approach produces significant
artifacts under blur, and that jointly optimizing the blur kernel and the
NeRF automatically decomposes the blurring parameters from the scene content,
resulting in a sharp NeRF. To address poor editability, we take inspiration
from explicit representations that support intuitive editing, such as meshes.
We present Neural Parameterization (NeP), a hybrid representation that
combines the advantages of both implicit and explicit methods. It is capable
of photorealistic rendering while allowing fine-grained editing of the
geometry and appearance of a dense 3D volume. Finally, we extend existing
static representations to model 3D video loops. We develop an efficient 3D
video representation, namely Multi-Tile Video (MTV), that exploits
spatio-temporal sparsity to achieve real-time rendering, even on mobile
devices. We also propose a novel looping loss based on video temporal
retargeting, which generates photorealistic looping MTVs given only
asynchronous multi-view videos. We conduct extensive experiments to
qualitatively and quantitatively evaluate all the proposed improvements.
Results indicate that our methods show promise in improving the robustness,
editability, and efficiency of volumetric representations.


Date:			Thursday, 9 November 2023

Time:			2:00pm - 4:00pm

Venue:			Room 5562
			Lifts 27/28

Chairman:		Prof. Abhishek SRIVASTAVA (ECE)

Committee Members:	Prof. Pedro SANDER (Supervisor)
			Prof. Qifeng CHEN
			Prof. Long QUAN
			Prof. Ping TAN (ECE)
			Prof. Hongbo FU (CityU)


**** ALL are Welcome ****