From Code Completion to Harness Engineering: The Evolving Design Space of Coding Agents

PhD Qualifying Examination


Title: "From Code Completion to Harness Engineering: The Evolving Design Space of
Coding Agents"

by

Mr. Ka Shun SHUM


Abstract:

Large language models for code have changed rapidly from systems that 
complete local fragments into agents that operate inside software 
environments. A simple chronological account (from Codex to SWE-agent, Claude 
Code, and Qwen Code) misses the central technical shift. The field has 
repeatedly redefined what "coding capability" means: first as the ability to 
generate correct programs from prompts, then as the ability to interact with 
tools, then as the ability to control expensive context and reuse procedural 
knowledge, and finally as the ability of a model and its surrounding harness 
to co-evolve around executable feedback. This survey reviews coding agents 
through that conceptual evolution. It begins with code generation, 
fill-in-the-middle editing, competitive programming, and execution-based 
verification as the model-centric substrate. It then studies the tool-loop 
era represented by repository-level issue resolution, SWE-bench, SWE-agent, 
OpenHands, and Agentless, where search, editing, shell commands, and tests 
become part of the computation. The survey next argues that long-horizon 
software engineering exposes a context bottleneck: agents must decide what to 
retrieve, compress, remember, forget, and package into reusable skills. 
Finally, it frames the current frontier as harness engineering, where 
permissions, state persistence, hooks, skills, subagents, evaluators, 
sandboxes, and training environments determine what a coding model can 
reliably do. The unifying thesis is that modern coding agents are 
harness-mediated software engineering systems rather than standalone code 
models. Their reliability depends not only on model scale, but also on 
higher-fidelity feedback, context and skill reuse, and model-environment 
co-design. This perspective leads naturally to open research problems in 
process-level evaluation, verifier calibration, reward modeling, long-horizon 
dependability, safe autonomy, and self-improving skills. These problems 
define a research agenda for the verification and feedback layer of future 
coding-agent harnesses.


Date:                   Wednesday, 20 May 2026

Time:                   10:00am - 12:00noon

Venue:                  Room 2128A
                        Lift 19

Committee Members:      Dr. Junxian He (Supervisor)
                        Prof. Raymond Wong (Chairperson)
                        Dr. May Fung