Generative World Models for Robot Learning: A Survey
PhD Qualifying Examination
Title: "Generative World Models for Robot Learning: A Survey"
by
Mr. Fangqi ZHU
Abstract:
Robotics is moving from task-specific controllers toward general-purpose
agents that can operate in open, dynamic environments. Vision-language-action
models have improved semantic understanding and action generation, but many
remain largely reactive: they select actions from observations and
instructions without explicitly predicting how those actions will change
objects, contacts, and future task states. This gap motivates generative
world models, which learn predictive representations of physical dynamics and
use them to imagine future observations, evaluate candidate behaviors,
synthesize robot data, support planning, and improve policies before costly
real-world execution. This survey reviews generative world models for robot
learning through two complementary roles. As external simulators, they
provide counterfactual rollouts for data generation, policy evaluation, model
predictive control, and imagined reinforcement learning. As internal
predictive modules, they embed foresight inside robot policies or
vision-language-action models, supporting inverse dynamics, action
generation, and embodied reasoning. Across latent world models, video
foundation models, action-conditioned robot simulators, and
world-model-based policy optimization, the survey focuses on representation,
action grounding, decision coupling, and reliability, arguing that useful
robot world models must move beyond visually plausible generation toward
action-faithful, physically grounded, uncertainty-aware, and efficient
prediction.
Date: Wednesday, 27 May 2026
Time: 11:30am - 1:00pm
Venue: Room 2128B
Lift 19
Committee Members: Prof. Song Guo (Supervisor)
Dr. Dan Xu (Chairperson)
Dr. Long Chen