From Code Completion to Harness Engineering: The Evolving Design Space of Coding Agents
PhD Qualifying Examination
Title: "From Code Completion to Harness Engineering: The Evolving Design Space of
Coding Agents"
by
Mr. Ka Shun SHUM
Abstract:
Large language models for code have evolved rapidly from systems that
complete local fragments into agents that operate inside software
environments. A simple chronological account (from Codex to SWE-agent, Claude
Code, and Qwen Code) misses the central technical shift. The field has
repeatedly redefined what "coding capability" means: first as the ability to
generate correct programs from prompts, then as the ability to interact with
tools, then as the ability to control expensive context and reuse procedural
knowledge, and finally as the ability of a model and its surrounding harness
to co-evolve around executable feedback. This survey reviews coding agents
through that conceptual evolution. It begins with code generation,
fill-in-the-middle editing, competitive programming, and execution-based
verification as the model-centric substrate. It then studies the tool-loop
era represented by repository-level issue resolution, SWE-bench, SWE-agent,
OpenHands, and Agentless, where search, editing, shell commands, and tests
become part of the computation. The survey next argues that long-horizon
software engineering exposes a context bottleneck: agents must decide what to
retrieve, compress, remember, forget, and package into reusable skills.
Finally, it frames the current frontier as harness engineering, where
permissions, state persistence, hooks, skills, subagents, evaluators,
sandboxes, and training environments determine what a coding model can
reliably do. The unifying thesis is that modern coding agents are
harness-mediated software engineering systems rather than standalone code
models. Their reliability depends not only on model scale, but also on
higher-fidelity feedback, context and skill reuse, and model-environment
co-design. This perspective leads naturally to open research problems in
process-level evaluation, verifier calibration, reward modeling, long-horizon
dependability, safe autonomy, and self-improving skills. These problems
define a research agenda for the verification and feedback layer of future
coding-agent harnesses.
Date: Wednesday, 20 May 2026
Time: 10:00am - 12:00noon
Venue: Room 2128A
Lift 19
Committee Members: Dr. Junxian He (Supervisor)
Prof. Raymond Wong (Chairperson)
Dr. May Fung