The Cost of Collaboration: A Survey of Efficiency-Aware Multi-LLM Systems

PhD Qualifying Examination


Title: "The Cost of Collaboration: A Survey of Efficiency-Aware Multi-LLM 
Systems"

by

Mr. Haochen SHI


Abstract:

Multi-LLM systems extend single-agent LLMs by composing multiple model calls, 
agents, roles, memories, tools, and runtime resources into a coordinated 
computation. This horizontal scaling path can improve task performance through 
decomposition, specialization, parallel search, context isolation, and 
verification, but it also introduces substantial resource overhead in tokens, 
latency, tool calls, memory, model loading, serving throughput, and training 
or rollout cost. This survey studies how recent multi-LLM systems improve 
efficacy while controlling these costs.

We organize the literature into a two-level taxonomy. At the harness level, 
we review methods that optimize the logical structure of the system: 
capability allocation selects which model, role, expert, or subteam should 
execute each unit of work; work-graph control shapes task decomposition, 
branching, scheduling, pruning, and stopping; information-state control 
decides what each model call or agent sees, retrieves, stores, and shares; 
and external-action control covers emerging methods for budgeting tools, 
APIs, code execution, environments, and human clarification. At the device 
level, we review runtime methods that make these harness decisions efficient 
on concrete hardware, including workflow-aware scheduling, multi-LLM and 
multi-LoRA serving, KV-cache and context-state optimization, placement, 
batching, and training-time rollout efficiency.

Across these domains, we emphasize quality-resource trade-offs, baseline 
selection, and end-to-end accounting. The survey provides a map for 
designing multi-LLM systems that are not only more capable, but also 
cost-aware, deployable, and empirically accountable.


Date:                   Monday, 22 June 2026

Time:                   10:00am - 12:00pm

Venue:                  Room 3494
                        Lift 25/26

Committee Members:      Dr. Yangqiu Song (Supervisor)
                        Prof. Nevin Zhang (Chairperson)
                        Dr. May Fung