Scalable 3D/4D Scene Representations for Grounded Spatial Intelligence

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence


Title: "Scalable 3D/4D Scene Representations for Grounded Spatial Intelligence"

By

Mr. Xinhang LIU


Abstract:

Despite recent advances, LLMs and VLMs still struggle with robust spatial
understanding and physical interaction because they lack structured
representations for scene geometry, correspondence, and motion. This thesis
investigates how to design 3D and 4D scene representations that make
structural and dynamic information persistent, queryable, and learnable at
scale. We first improve visual fidelity by extending the neural rendering
paradigm for dynamic scenes with layered representations, diffusion-augmented
sparse-view reconstruction, and motion-aware spatio-temporal sampling. However,
these rendering-first pipelines reveal key limitations: they lack intrinsic
temporal correspondence and face scalability issues due to scene-specific
optimization. To overcome these limitations, we introduce Trajectory Fields
and Trace Anything, a feed-forward, pixel-aligned 4D framework that directly
encodes dense cross-time correspondence. This formulation enables explicit
tracking and querying of scene elements an order of magnitude faster than
optimization-based alternatives. Supported by a new data platform and
benchmark, it shifts dynamic scene modeling toward scalable representation
learning. Building on this foundation, we explore spatial intelligence tasks
such as tracking and prediction, highlighting Point4Cast for unifying
perception and future-oriented geometric reasoning within a single
representation.
Ultimately, this thesis establishes a paradigm shift from rendering-centric
reconstruction to representation-centric state modeling, arguing that grounded
spatial intelligence requires unified, persistent, and scalable 3D/4D state
spaces.


Date:                   Friday, 8 May 2026

Time:                   11:00am - 1:00pm

Venue:                  Room 2132C
                        Lift 22

Chairman:               Prof. Yaping GONG (MGMT)

Committee Members:      Prof. Chi-Keung TANG (Supervisor)
                        Prof. Pedro SANDER
                        Dr. Dan XU
                        Prof. Ping TAN (ECE)
                        Prof. Michael BROWN (York University)