Multimodal 2D/3D/4D Avatar Generation and Editing

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence


Title: "Multimodal 2D/3D/4D Avatar Generation and Editing"

By

Mr. Hongyu LIU


Abstract:

The high-fidelity creation and animation of digital avatars represent a 
critical frontier in modern computer vision and graphics, fueled by the 
accelerating demand for immersive experiences in the Metaverse, virtual 
reality (VR), and digital communication. This thesis addresses the 
multifaceted challenges of avatar synthesis, specifically navigating the 
complex interplay between geometric accuracy, appearance fidelity, and motion 
controllability across 2D, 3D, and 4D modalities. Building on the shift from 
traditional reconstruction pipelines to data-driven generative approaches, 
namely Generative Adversarial Networks (GANs), Diffusion Models, and 
Autoregressive Models, this research proposes a series of novel frameworks 
to enhance digital human representation. In the 2D domain, we 
introduce CLCAE to improve GAN inversion for precise image editing and FFCLIP 
for free-form, text-driven manipulation through semantic modulations. For 
motion transfer, Human MotionFormer utilizes vision transformers to bridge 
the gap in complex human movement synthesis. Transitioning to 3D and 4D, we 
propose Follow-Your-Emoji for finely controllable portrait animation and 
HeadArtist-VL, which employs a novel Self Score Distillation (SSD) loss to 
resolve the "Janus" artifact in text-to-3D head generation. Finally, we 
present AvatarArtist, a pioneering open-domain 4D avatarization model that 
utilizes a parameterized triplane representation and diffusion priors to 
synthesize animatable avatars across diverse artistic styles. Experimental 
results across standard benchmarks demonstrate that these methodologies 
significantly advance the state of the art in rendering quality, identity 
preservation, and temporal coherence, establishing a robust foundation for 
the next generation of virtual humans.


Date:                   Friday, 16 January 2026

Time:                   2:30pm - 4:30pm

Venue:                  Room 5501
                        Lifts 25/26

Chairman:               Dr. Tracy Pik Yin MOK (ISD)

Committee Members:      Dr. Qifeng CHEN (Supervisor)
                        Dr. Xiaomin OUYANG
                        Dr. Dan XU
                        Prof. Ping TAN (ECE)
                        Dr. Hongsheng LI (CUHK)