A Survey of LLM Agent Serving Through the Lens of Context Engineering

PhD Qualifying Examination


Title: "A Survey of LLM Agent Serving Through the Lens of Context Engineering"

by

Mr. Yukun ZHOU


Abstract:

Large language model (LLM) agents do not solve tasks as isolated prompt 
calls. They repeatedly read external state, invoke tools, update memories, 
edit artifacts, branch into subproblems, and return to the model with a new 
working context. The systems problem is therefore not long-prompt execution 
alone, but repeated materialization of bounded model-visible context from 
persistent, mutable, and partially shared state.

This survey studies context engineering as the source of LLM agent serving 
workloads. We first organize agent-side techniques into three recurring 
functions: retaining and compressing past state, selecting and grounding 
external state, and coordinating reusable context across calls, branches, and 
agents. We then translate these functions into serving-visible signals, 
including context lifetime, volatility, provenance, materialization latency, 
state validity, and branch structure. These signals induce three system 
requirement families: context lifecycle management, critical-path 
materialization of external state, and workflow execution for correlated 
calls.

Finally, we use these requirements to interpret existing serving mechanisms. 
First, KV cache management, long-context execution, and phase scheduling 
reduce the cost of admitted tokens. Second, retrieval, tool-output 
processing, and artifact views prepare the next context before inference. 
Third, prefix reuse, prompt modules, cache handles, checkpointing, and 
workflow runtimes connect repeated calls. This mapping shows that current 
mechanisms cover important parts of agent serving, but rarely share a common 
model of context as structured, versioned, reusable state with provenance. We 
close with evaluation gaps and interface opportunities for exposing context 
regions, validity dependencies, branch scope, and scheduling priority.


Date:                   Monday, 11 May 2026

Time:                   10:00am - 12:00noon

Venue:                  Room 3494
                        Lift 25/26

Committee Members:      Dr. Wei Wang (Supervisor)
                        Dr. Binhang Yuan (Co-supervisor)
                        Prof. Song Guo (Chairperson)
                        Dr. Chaojian Li