Towards Controllable Video Generation with Diffusion Models
PhD Qualifying Examination

Title: "Towards Controllable Video Generation with Diffusion Models"

by

Mr. Yihao MENG

Abstract:

Controllable video generation has gained broad importance in computer vision due to its potential in creative content production. While text-driven approaches offer a convenient interface, they often struggle with identity consistency, fine-grained motion control, and nuanced camera viewpoints, limitations that restrict their applicability in many real-world scenarios. This challenge underscores the need for more robust methods capable of synthesizing realistic, temporally coherent videos with explicit control. Recently, diffusion-based generative modeling has emerged as a powerful framework, demonstrating considerable success in producing high-fidelity images and videos. In this survey, we investigate how these models can facilitate controllable video synthesis, highlighting the limitations of text-only guidance and underscoring the role of additional control signals such as semantic masks, pose keypoints, and camera parameters. We propose a taxonomy that encompasses three core dimensions of controllability (appearance, motion, and camera) and review current methods through this lens. By synthesizing key findings and challenges, this work aims to guide future research toward developing flexible, robust, and high-quality diffusion-based solutions for controllable video generation.

Date: Monday, 26 May 2025

Time: 10:00am - 12:00noon

Venue: Room 2128A (Lift 19)

Committee Members:
Prof. Huamin Qu (Supervisor)
Dr. Qifeng Chen (Chairperson)
Prof. Pedro Sander
Dr. Anyi Rao (AMC)