PhD Qualifying Examination
Title: "Towards Controllable Video Generation with Diffusion Models"
by
Mr. Yihao MENG
Abstract:
Controllable video generation has gained broad importance in computer vision
due to its potential for creative content production. While text-driven
approaches offer a convenient interface, they often struggle to maintain
identity consistency, exert fine-grained motion control, and specify nuanced
camera viewpoints, limitations that restrict their applicability in many
real-world scenarios. This challenge underscores the need for more robust
methods capable of synthesizing realistic, temporally coherent videos with
explicit control. Recently,
diffusion-based generative modeling has emerged as a powerful framework,
demonstrating considerable success in producing high-fidelity images and
videos. In this survey, we investigate how these models can facilitate
controllable video synthesis, highlighting the limitations of text-only
guidance and examining the role of additional control signals such as
semantic masks, pose keypoints, and camera parameters. We propose a taxonomy
that encompasses three core dimensions of controllability—appearance,
motion, and camera—and review current methods through this lens. By
synthesizing key findings and challenges, this work aims to guide future
research toward developing flexible, robust, and high-quality
diffusion-based solutions for controllable video generation.
Date: Monday, 26 May 2025
Time: 10:00am - 12:00noon
Venue: Room 2128A (Lift 19)
Committee Members: Prof. Huamin Qu (Supervisor)
Dr. Qifeng Chen (Chairperson)
Prof. Pedro Sander
Dr. Anyi Rao (AMC)