SEMANTICS-DRIVEN FACE RECONSTRUCTION, PROMPT EDITING AND RELIGHTING WITH DIFFUSION MODELS

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


MPhil Thesis Defence


Title: "SEMANTICS-DRIVEN FACE RECONSTRUCTION, PROMPT EDITING AND RELIGHTING 
WITH DIFFUSION MODELS"

By

Mr. Hao ZHANG


Abstract:

The ability to create high-quality 3D faces from a single image has become 
increasingly important with wide applications in video conferencing, AR/VR, and 
advanced video editing in the movie industry. In this thesis, we propose Face 
Diffusion NeRF (FaceDNeRF), a new generative method for reconstructing 
high-quality Face NeRFs from single images, complete with semantic editing and 
relighting capabilities. FaceDNeRF leverages high-resolution 3D GAN inversion 
and an expertly trained 2D latent diffusion model, allowing users to manipulate 
and construct Face NeRFs in a zero-shot manner without the need for explicit 
3D data. With carefully designed illumination and identity-preserving losses, 
as well as multi-modal pre-training, FaceDNeRF offers users unparalleled 
control over the editing process, enabling them to create and edit Face NeRFs 
using just single-view images, text prompts, and explicit target lighting. 
These capabilities are designed to produce more compelling results than 
existing 2D editing approaches that rely on 2D segmentation maps for editable 
attributes. Experiments show that FaceDNeRF achieves exceptionally realistic 
results and unprecedented flexibility in editing compared with 
state-of-the-art 3D face reconstruction and editing methods.


Date:                   Monday, 26 August 2024

Time:                   11:00am - 1:00pm

Venue:                  Room 3494
                        Lifts 25/26

Chairman:               Prof. Dit-Yan YEUNG

Committee Members:      Prof. Chi-Keung TANG (Supervisor)
                        Dr. Qifeng CHEN