Towards Practical Volumetric Scene Representation for Novel View Synthesis
PhD Thesis Proposal Defence

Title: "Towards Practical Volumetric Scene Representation for Novel View Synthesis"

by

Mr. Li MA

Abstract:

Given some sparse observations of a 3D scene, novel view synthesis (NVS) aims to reconstruct arbitrary views of the scene. It has many practical applications, such as virtual reality, online conferencing, and 3D content creation. A common paradigm for NVS is to reconstruct a 3D representation, which can then be rendered to generate novel views. Classical approaches rely on off-the-shelf structure-from-motion (SfM) algorithms to reconstruct a mesh or point cloud, which tend to fail in challenging cases such as complex geometry and strong view-dependent effects. Recently, volumetric representations, such as Multiplane Images (MPI) and the Neural Radiance Field (NeRF), have revolutionized NVS by generating images that are nearly indistinguishable from real ones. However, several questions remain open for volume-based methods. For instance, the reconstruction process is prone to errors under non-ideal capturing conditions, such as blurriness. 3D volumes are also difficult to edit, hindering post-production modifications. Moreover, they are inefficient to render, especially for dynamic scenes, precluding interactive applications.

This dissertation addresses several limitations of volumetric representations and proposes improvements that enhance their practical applicability. We first propose a method for restoring a sharp NeRF from blurry observations. We show that straightforward reconstruction produces significant artifacts under blurry settings, and that jointly optimizing the blur kernel and the NeRF automatically decomposes the blurring parameters from the scene content, resulting in a sharp NeRF. To address the poor editability, we take inspiration from explicit representations that support intuitive editing, such as meshes.
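As a rough illustration of the joint-optimization idea behind the deblurring work (a toy 1D analogue, not the thesis's actual method, which operates on a NeRF and camera-motion blur), the sketch below jointly fits a blur kernel and a sharp signal to a blurry observation by gradient descent. All names, sizes, and hyperparameters here are invented for illustration; a real system would add priors and a physically motivated kernel model to resolve the ambiguity of blind deconvolution.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 32, 3                      # signal length, blur-kernel size

def circ_conv(x, k):
    """Circular convolution of signal x with kernel k."""
    out = np.zeros(len(x))
    for i in range(len(x)):
        for j in range(len(k)):
            out[i] += k[j] * x[(i - j) % len(x)]
    return out

# Ground truth: a sharp step signal blurred by a box kernel.
x_true = np.where(np.arange(N) < N // 2, 1.0, 0.0)
k_true = np.ones(K) / K
y = circ_conv(x_true, k_true)     # the blurry "observation"

# Initial estimates: the blurry signal itself, and a near-delta kernel.
x = y.copy()
k = np.array([0.1, 0.8, 0.1])
loss0 = 0.5 * np.sum((circ_conv(x, k) - y) ** 2)

LR = 0.05
for step in range(500):
    r = circ_conv(x, k) - y       # residual of the re-blurred estimate
    # Analytic gradients of 0.5 * ||r||^2 w.r.t. x and k.
    gx = np.zeros(N)
    gk = np.zeros(K)
    for i in range(N):
        for j in range(K):
            gx[(i - j) % N] += r[i] * k[j]
            gk[j] += r[i] * x[(i - j) % N]
    x -= LR * gx                  # update the "scene" estimate
    k -= LR * gk                  # update the blur kernel jointly
    k = np.clip(k, 0.0, None)
    k /= k.sum()                  # keep k a valid blur: non-negative, sums to 1

final_loss = 0.5 * np.sum((circ_conv(x, k) - y) ** 2)
print(loss0, final_loss)
```

The two unknowns receive gradients from the same reconstruction residual, so the optimizer is free to trade blur between the kernel and the signal; this is why the thesis's setting needs the joint decomposition to be well-constrained before a sharp result falls out.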
We present Neural Parameterization (NeP), a hybrid representation that combines the advantages of implicit and explicit methods. It is capable of photorealistic rendering while allowing fine-grained editing of the geometry and appearance of a dense 3D volume. Finally, we extend existing static representations to model 3D video loops. We develop an efficient 3D video representation, namely Multi-Tile Video (MTV), which exploits spatio-temporal sparsity to achieve real-time rendering, even on mobile devices. We also propose a novel looping loss based on video temporal retargeting, which generates photorealistic looping MTVs given only asynchronous multi-view videos. We conduct extensive experiments to evaluate all of the proposed improvements, both qualitatively and quantitatively. The results indicate that our methods improve the robustness, editability, and efficiency of volumetric representations.

Date: Monday, 27 March 2023
Time: 2:00pm - 4:00pm
Venue: Room 5510 (lifts 25/26)

Committee Members:
Prof. Pedro Sander (Supervisor)
Prof. Chiew-Lan Tai (Chairperson)
Dr. Qifeng Chen
Prof. Chi-Keung Tang

**** ALL are Welcome ****