PhD Thesis Proposal Defence


Title: "Post-Training for Visual Synthesis: Safe and Powerful Generation"

By

Mr. Runtao LIU


Abstract:

Visual generative models are powerful but hard to deploy safely and to align
with user preferences. On the safety side, we study two complementary
approaches: external input filtering and internal model-side alignment. We
develop LatentGuard, an input-side, representation-space blacklist detector
that is robust to paraphrases and prompt obfuscation. For internal knowledge
erasure, we propose AlignGuard, a model-side approach that applies DPO-tuned
LoRA safety experts, merged via CoMerge, to steer diffusion models away from
unsafe content. To train at scale, we build the CoPro/CoProV2 datasets, fully
automatically collected pairs of (harmful, safe) prompts and images spanning
728 concepts. For preference alignment in visual generation, we introduce
VideoDPO, trained on preference data scored by OmniScore, a joint measure of
visual quality and semantic faithfulness. We automatically construct
preference pairs and apply OmniScore-based re-weighting to emphasize
informative samples. Across popular visual-generation backbones, our approach
expands safety coverage with minimal impact on benign creativity in
text-to-image (T2I) generation and improves fidelity and prompt following in
text-to-video (T2V) generation. In sum, these explorations address both
fronts, building safer and better-aligned visual generative models.


Date:                   Friday, 17 October 2025

Time:                   10:00am - 12:00noon

Venue:                  Room 4475
                        Lifts 25/26

Committee Members:      Dr. Qifeng Chen (Supervisor)
                        Dr. Long Chen (Chairperson)
                        Dr. May Fung