The Hong Kong University of Science and Technology
Department of Computer Science and Engineering
PhD Thesis Defence
Title: "Towards Practical Volumetric Scene Representation for Novel View
Synthesis"
By
Mr. Li MA
Abstract:
Given sparse observations of a 3D scene, novel view synthesis (NVS) aims to
reconstruct arbitrary views of that scene. It has many practical applications,
such as virtual reality, online conferencing, and 3D content creation. A common
paradigm for NVS is to reconstruct a 3D representation, which can then be
rendered to generate novel views. Classical approaches rely on off-the-shelf
structure-from-motion (SfM) algorithms to reconstruct a mesh or point cloud,
which tend to fail in challenging cases such as complex geometry and strong
view-dependent effects. Recently, volumetric representations, such as
Multiplane Images (MPI) and Neural Radiance Fields (NeRF), have revolutionized
NVS by generating images nearly indistinguishable from real photographs.
However, open questions remain for volume-based methods. For instance, the
reconstruction process is prone to errors under non-ideal capturing conditions,
such as blurry inputs. Moreover, 3D volumes are difficult to edit, hindering
post-production modifications. They are also inefficient to render, especially
for dynamic scenes, precluding interactive applications.
This dissertation addresses several limitations of volumetric representations
and proposes improvements that benefit several practical applications. We first
propose a method for restoring a sharp NeRF from blurry observations. We show
that a straightforward reconstruction approach produces significant artifacts
under blurry settings, and that jointly optimizing the blur kernel and the NeRF
automatically decomposes the blurring parameters from the scene content,
resulting in a sharp NeRF. To address the poor editability, we take inspiration
from explicit representations that support intuitive editing, such as meshes.
We present Neural Parameterization (NeP), a hybrid representation that combines
the advantages of implicit and explicit methods. It is capable of
photorealistic rendering while allowing fine-grained editing of the geometry
and appearance of a dense 3D volume. Finally, we extend existing
static representations to model 3D video loops. We develop an efficient 3D
video representation, namely Multi-Tile Video (MTV), which exploits
spatio-temporal sparsity to achieve real-time rendering, even on mobile
devices. We also propose a novel looping loss based on video temporal
retargeting, which generates photorealistic looping MTVs given only
asynchronous multi-view videos. We conduct extensive experiments to evaluate
all the proposed improvements, both qualitatively and quantitatively. Results
show that our methods improve the robustness, editability, and efficiency of
volumetric representations.
Date: Thursday, 9 November 2023
Time: 2:00pm - 4:00pm
Venue: Room 5562
Lifts 27/28
Chairman: Prof. Abhishek SRIVASTAVA (ECE)
Committee Members: Prof. Pedro SANDER (Supervisor)
Prof. Qifeng CHEN
Prof. Long QUAN
Prof. Ping TAN (ECE)
Prof. Hongbo FU (CityU)
**** ALL are Welcome ****