From Virtual to Physical: Evolving Visual Generation Agents across Simulated, Real, and Physics-Aware Environments
PhD Qualifying Examination
Title: "From Virtual to Physical: Evolving Visual Generation Agents across
Simulated, Real, and Physics-Aware Environments"
by
Mr. Jinxiang LAI
Abstract:
The field of Visual Generation Agents (VGAs) is undergoing a rapid evolution
from single-shot generation toward multi-step interactive workflows. Despite
this momentum, the area still lacks a systematic survey that clarifies its
core challenges, training paradigms, and future directions. Through a
problem-driven analysis, this survey first identifies three properties that
fundamentally distinguish VGAs from conventional Large Language Model (LLM)
agents: (1) tools are stochastic generation systems, which conflates training
signals and obscures responsibility attribution; (2) rewards are subjective,
multi-dimensional, and creative, which introduces evaluation uncertainty; and
(3) outputs must obey physical laws, which imposes strong generative
constraints. These visual-specific properties give rise to six core challenges
that VGAs face during training and deployment, which we systematically review
together with their representative solutions. Building on this analysis, we
observe that VGA training paradigms naturally diverge into three complementary
environments, namely Virtual, Real, and Physics-Aware, which respectively
address training affordability, training correctness, and training fidelity.
Accordingly, we propose a Virtual-Real-Physics-Aware co-training framework,
in which three sequential stages jointly produce agents capable of
orchestrating diverse tools to generate content that is both creative and
physically faithful. Finally, we outline future directions on token-level
reward optimization, personalized agents, and 3D-native physics-faithful
generation. By identifying the core challenges and training paradigms of VGAs
and outlining a co-training framework that traces the evolution from virtual
simulation to real physical environments, this survey aims to provide a clear
roadmap and theoretical foundation for future research.
Date: Wednesday, 10 June 2026
Time: 2:00pm - 4:00pm
Venue: Room 3494
Lift 25/26
Committee Members: Prof. Song Guo (Supervisor)
Dr. Dan Xu (Chairperson)
Dr. Zihan Zhang