In Search of Geometric Fidelity and Visual Alignment for Generating 3D Objects
PhD Qualifying Examination
Title: "In Search of Geometric Fidelity and Visual Alignment for Generating
3D Objects"
by
Mr. Kaiyi ZHANG
Abstract:
Driven by increasing demands in virtual reality, gaming, and industrial
design, the field of 3D object generation is advancing rapidly. However,
achieving stable, professional-grade quality remains a significant
challenge, primarily due to two critical issues: poor geometric fidelity and
incoherent visual alignment. We observe that current methods often produce
over-smoothed surfaces or structurally inconsistent shapes. Our analysis
reveals that these issues stem from distinct fundamental causes. We attribute
poor geometric fidelity to the limitations of mainstream latent
representations, specifically the trade-off between VecSets, which discard
high-frequency details, and Sparse Grids, which face computational hurdles.
Furthermore, we link structural inconsistencies to the generative models’
insufficient semantic understanding of conditioning images, as existing
feature extractors often fail to capture the necessary information for
spatially coherent generation.
This survey explores strategies to address these challenges by rethinking
data representations and integration methods. We introduce the Latent
Flexible Grid representation as a balanced solution that robustly handles
irregular topologies and enables local editability. Additionally, we examine
the potential of the Sparse Grid representation for scaling up resolution. To
fully leverage sparse grids, we propose the Watertight Geometry
Standardization pipeline, a data-centric approach that normalizes diverse
mesh formats into a consistent, high-quality training dataset. Finally, to
improve visual alignment, we discuss strategies for incorporating robust
semantic priors into the generation pipeline. We argue that leveraging
Vision-Language Models and enforcing representation alignment with spatial
information can effectively guide models to produce geometrically precise and
structurally plausible results.
Date: Monday, 15 December 2025
Time: 10:00am - 12:00pm
Venue: Room 2128A (Lift 19)
Committee Members: Prof. Long Quan (Supervisor)
Dr. Dan Xu (Chairperson)
Dr. Long Chen