Scalable 3D Generation: From Leveraging Foundation Models to Building Data Engines

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence


Title: "Scalable 3D Generation: From Leveraging Foundation Models to Building 
Data Engines"

By

Miss Chenhan JIANG


Abstract:

The rapid advancement of 2D foundation models has unlocked promising 
capabilities in 3D content creation. However, directly lifting 2D models to 
3D often leads to severe artifacts, such as the multi-face Janus problem, 
distorted geometric structures, and user-intent misalignment. These 
limitations stem primarily from the inherent lack of 3D spatial understanding 
in 2D priors. Conversely, training native 3D models is bottlenecked by the 
scarcity of large-scale, high-quality 3D training data.

To overcome these bottlenecks, this thesis aims to build a high-fidelity, 
scalable 3D generative system. We achieve this through a systematic 
transition: from leveraging existing 2D foundation models to constructing 
underlying data engines for native 3D model training. This research focuses 
on two core aspects:

First, we demonstrate how to effectively harness 2D priors to produce 
geometrically and textually consistent 3D content. We introduce a novel 
agentic pipeline designed to generate structured, textured, and highly 
editable 3D assets, mitigating the inherent flaws of direct 2D-to-3D lifting.

Second, to break the data ceiling restricting the evolution of native 3D 
models, we introduce scalable data engines. By leveraging real-world data, we 
construct multimodal-aligned feature spaces and significantly scale up 
training datasets, thereby improving native novel-view synthesis models.

By integrating foundation model priors, agentic generative frameworks, and 
scalable data engines, this research contributes to the essential 
infrastructure of 3D content creation. The thesis concludes with a 
forward-looking perspective on the transition toward native 3D foundation 
models and a roadmap for empowering modern graphics workflows.


Date:                   Tuesday, 26 May 2026

Time:                   10:00am - 12:00noon

Venue:                  Room 2128A
                        Lift 19

Chairman:               Dr. Qing CHEN (MECH)

Committee Members:      Prof. Dit-Yan YEUNG (Supervisor)
                        Prof. Fangzhen LIN
                        Dr. Dan XU
                        Dr. Fangneng ZHAN (AMC)
                        Prof. Guanbin LI (Sun Yat-Sen University)