Runtime-First Grounded Guidance in Hybrid Autoregressive-Diffusion Generation

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

Final Year Thesis Oral Defense

Title: "Runtime-First Grounded Guidance in Hybrid Autoregressive-Diffusion 
Generation"

by

GAO Yitang

Abstract:

This thesis investigates grounded guidance for generation, beginning with 
text-side classifier-free guidance and later focusing on hybrid 
autoregressive-diffusion image generation. The work shows that numerically 
stable guidance alone is not sufficient; the weak reference must remain 
informative, and the most effective correction occurs at the diffusion-side 
refinement stage. Using degraded but structured references together with 
projected residual control, the study obtains consistent gains under matched 
comparisons and identifies a replay-validated runtime teacher as the 
strongest controller in the final system. The thesis also evaluates later
follow-up lines of work on internalization and search, including native
residual-head placement and ranked-candidate search, and positions them as
mechanism studies and future directions rather than as stronger replacements.
Overall, the project contributes a runtime-first grounded-guidance framework
that localizes where correction is effective, documents its behavior under
controlled comparisons, and preserves a clear experimental lineage for the
final thesis report.

Date            : 5 May 2026 (Tuesday)

Time            : 14:50 - 15:30

Venue           : Room 2131B (near lift 19), HKUST

Advisor         : Dr. CHEN Long

2nd Reader      : Dr. XU Dan