Controllable Image Generation: Foundations, Methodologies, and Emerging Frontiers
PhD Qualifying Examination
Title: "Controllable Image Generation: Foundations, Methodologies, and Emerging
Frontiers"
by
Mr. Weiyan XIE
Abstract:
Recent advances in deep generative models, particularly the shift from
U-Net-based Latent Diffusion Models (LDMs) to Diffusion Transformers (DiTs),
have led to unprecedented quality in text-to-image synthesis. However,
standard generation methods still lack precise control over structural
constraints, spatial composition, and subject identity. This survey provides a
comprehensive taxonomy and analysis of Controllable Image Generation, a
rapidly evolving field focused on aligning generative priors with explicit
user intent.
We organize the literature into three main methodological areas: (1)
Structural Control, which traces the evolution of the ControlNet family from
foundational zero-convolution architectures to efficient, multi-condition
adaptations for DiT backbones; (2) Layout-Controllable Generation, which
covers techniques for spatial grounding through training-based adapters,
training-free attention modulation, and LLM-guided planning; and (3)
Subject-Controllable Generation, which surveys approaches for identity
preservation, ranging from optimization-heavy fine-tuning to instant
encoder-based personalization and multi-subject composition.
Beyond these three axes, this survey highlights a paradigm shift toward
Unified Understanding-Generation Architectures that bring visual understanding
and generation together within a single framework. We explore this emerging
frontier and examine how such architectures can be applied to controllable
image generation. We also provide a comparative analysis of state-of-the-art
methods across key benchmarks for the different controllable generation
tasks, and discuss open challenges along with potential future directions for
the field.
Date: Wednesday, 4 March 2026
Time: 10:00am - 12:00noon
Venue: Room 2132C, Lift 19
Committee Members: Prof. Nevin Zhang (Supervisor)
Dr. Dan Xu (Chairperson)
Dr. Long Chen