Post-Training for Visual Synthesis: Safe and Powerful Generation
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering
PhD Thesis Defence
Title: "Post-Training for Visual Synthesis: Safe and Powerful Generation"
By
Mr. Runtao LIU
Abstract:
Visual generative models are powerful but difficult to deploy safely and to
align with user preferences. This thesis studies post-training methods that
improve both safety and generation quality for text-to-image and
text-to-video systems.
We first develop LatentGuard, an input-side blacklist detector that operates
in representation space and is robust to paraphrases and prompt obfuscation.
For internal
safety alignment, we propose AlignGuard, a model-side approach that trains
DPO-tuned LoRA safety experts and merges them with CoMerge to steer diffusion
models away from unsafe content. To support scalable safety training, we
build the CoPro/CoProV2 dataset series, an automatically constructed
collection of paired harmful and safe prompts and images spanning 723 concepts.
Beyond safety, we introduce VideoDPO for text-to-video preference alignment.
VideoDPO uses OmniScore, a joint measure of visual quality and semantic
faithfulness, to automatically construct preference pairs and re-weight
informative samples. Finally, we present DisRM, a discriminative reward
modeling framework that replaces large pairwise preference datasets with a
small set of representative Preference Proxy Data and supports iterative
post-training through sample selection, Supervised Fine-Tuning, and Direct
Preference Optimization. Together, these methods broaden safety coverage with
limited impact on benign creativity and improve fidelity, prompt following,
and reward modeling efficiency across visual-generation backbones.
Date: Thursday, 4 June 2026
Time: 10:00am - 12:00 noon
Venue: Room 3494 (Lifts 25/26)
Chairman:
Committee Members: Dr. Qifeng CHEN (Supervisor)
Dr. May FUNG
Dr. Ling PAN
Dr. Wenhan LUO (AMC)
Dr. Hongsheng LI (CUHK)