Controllable Image Generation: Foundations, Methodologies, and Emerging Frontiers

PhD Qualifying Examination


Title: "Controllable Image Generation: Foundations, Methodologies, and Emerging
Frontiers"

by

Mr. Weiyan XIE


Abstract:

Recent advances in deep generative models, particularly the shift from 
U-Net-based Latent Diffusion Models (LDMs) to Diffusion Transformers (DiTs), 
have led to unprecedented quality in text-to-image synthesis. However, 
standard generation methods still lack precise control over structural 
constraints, spatial composition, and subject identity. This survey provides a 
comprehensive taxonomy and analysis of Controllable Image Generation, a 
rapidly evolving field focused on aligning generative priors with explicit 
user intent.

We organize the literature into three main methodological areas: (1) 
Structural Control, which traces the evolution of the ControlNet family from 
foundational zero-convolution architectures to efficient, multi-condition 
adaptations for DiT backbones; (2) Layout-Controllable Generation, which 
covers techniques for spatial grounding through training-based adapters, 
training-free attention modulation, and LLM-guided planning; and (3) 
Subject-Controllable Generation, which surveys approaches for identity 
preservation, ranging from optimization-heavy fine-tuning to instant 
encoder-based personalization and multi-subject composition.

Beyond these three axes, this survey highlights a paradigm shift toward 
Unified Understanding-Generation Architectures that bring visual understanding 
and generation together within a single framework. We explore this emerging
frontier and examine how such architectures can be applied to controllable
image generation. We also provide a comparative analysis of state-of-the-art
methods across key benchmarks for the different controllable generation tasks,
and discuss open challenges along with likely future directions for the field.


Date:                   Wednesday, 4 March 2026

Time:                   10:00am - 12:00noon

Venue:                  Room 2132C
                        Lift 19

Committee Members:      Prof. Nevin Zhang (Supervisor)
                        Dr. Dan Xu (Chairperson)
                        Dr. Long Chen