From Model Robustness to Harness-Level Protection: A Survey on Security of Long-Context LLM Agents
PhD Qualifying Examination

Title: "From Model Robustness to Harness-Level Protection: A Survey on Security of Long-Context LLM Agents"

by Mr. Yanbo DAI

Abstract:

Large language model (LLM) agents have evolved from single-turn chat systems into long-horizon autonomous systems. They plan over extended contexts, retrieve from external memory, invoke tools, execute code, and coordinate with other agents within an execution harness. This evolution changes the appropriate unit of security analysis. Model robustness remains essential, but it is no longer sufficient. Even a well-aligned model may produce an unsafe trajectory if the harness exposes over-permissive tools, sends private context to unauthorized recipients, or allows untrusted data to affect control flow.

This survey organizes the literature around two complementary layers. The first, model robustness, concerns the model's intrinsic resistance to jailbreaks, prompt injection, poisoning, backdoors, and information extraction. The second, harness-level protection, concerns the enforcement of permission, information-flow, and coordination boundaries throughout the execution trajectory.

We make four contributions. First, we propose a two-layer taxonomy that separates model robustness from harness-level protection while explaining how long context couples the two. Second, we construct a unified attack-defense map. It connects model-level mechanisms, including alignment, jailbreaks, direct prompt injection, poisoning, backdoors, and privacy extraction, with their harness-level counterparts, including indirect injection, memory poisoning, tool and code abuse, information leakage, and multi-agent propagation. Third, we model the harness as a policy-constrained execution system governed by permission policies, information-flow policies, and coordination policies. We relate these policies to trajectory-level properties such as boundary compliance, execution fidelity, and system stability.
Fourth, we connect this framework to benchmarks, runtime auditing, assurance, governance, and open research challenges.

A recurring theme is that long context amplifies security risk. Violations can accumulate as trajectories grow, memory can preserve adversarial state across steps, and many-shot contexts can enable attacks that short-context models resist. Agent safety should therefore be evaluated over the entire execution trajectory, rather than only through the final response.

Date: Wednesday, 24 June 2026
Time: 1:00pm - 3:00pm
Zoom Meeting: https://hkust.zoom.us/j/96184667833?pwd=rg9yi3hEdPkLSpRdbjpbi3c4SrDmM1.1

Committee Members:
Dr. Shuai Wang (Supervisor)
Dr. Binhang Yuan (Chairperson)
Dr. Dan Xu