Application-Aware Communication Optimization for Distributed Systems

PhD Thesis Proposal Defence


Title: "Application-Aware Communication Optimization for Distributed Systems"

by

Mr. Xudong LIAO


Abstract:

As modern datacenters scale to support increasingly complex and 
data-intensive applications, overall system efficiency remains a persistent 
optimization goal across the computing stack. While decades of research have 
pushed the boundaries of computation, communication continues to be treated 
largely as an infrastructure-level concern, abstracted away from application 
semantics. This paradigm, however, falls short in contemporary systems where 
communication patterns are tightly coupled with workload behaviors and 
runtime dynamics, ultimately limiting the ability to optimize system 
performance in a holistic and principled manner.

This dissertation advocates a shift toward application-aware 
communication optimization: a design paradigm that leverages application 
characteristics to guide communication scheduling, resource allocation, and 
interconnect configuration. By embracing this principle, we show that 
distributed systems can achieve significantly improved performance, 
scalability, and responsiveness across a broad range of scenarios.

We begin with Pallas, a rack-scale CPU scheduling system targeting 
microsecond-level services. Pallas introduces an in-network workload shaping 
mechanism that partitions mixed workloads into homogeneous shards at the 
top-of-rack switch. This design enables simple yet near-optimal scheduling 
within each server and reduces tail latency under dynamic load patterns. 
Pallas demonstrates that proactive, application-aware scheduling at the 
network level can effectively improve datacenter responsiveness.
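
To make the workload-shaping idea concrete, the sketch below shows one plausible 
realization in Python. The latency threshold, shard layout, and server names are 
illustrative assumptions rather than Pallas's actual mechanism; the point is that 
once the top-of-rack switch steers each request class to its own shard, every 
server only ever sees a homogeneous queue.

    # A minimal, hypothetical sketch of in-network workload shaping. It assumes
    # the ToR switch can read an application-supplied service-time hint on each
    # request and steer each class to its own group of servers (a "shard").
    from collections import defaultdict

    SHARDS = {
        "short": ["server-1", "server-2"],   # microsecond-scale requests
        "long":  ["server-3", "server-4"],   # longer-running requests
    }
    rr_state = defaultdict(int)              # per-class round-robin pointer

    def classify(request):
        # Map a request to a workload class using its (assumed) latency hint.
        return "short" if request["expected_us"] <= 10 else "long"

    def steer(request):
        # Forward the request to the next server in its class's shard.
        cls = classify(request)
        servers = SHARDS[cls]
        server = servers[rr_state[cls] % len(servers)]
        rr_state[cls] += 1
        return server

    # Each server now receives only one request class, so a plain FIFO queue
    # per server already behaves close to optimally for tail latency.
    print(steer({"expected_us": 4}))     # -> server-1
    print(steer({"expected_us": 250}))   # -> server-3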

Second, we present Herald, a distributed training system for deep learning 
recommendation models (DLRMs). Herald leverages the sparse and predictable 
access patterns of embedding layers to perform location-aware input 
assignment and dynamic communication plan generation, which significantly 
reduces redundant data transfers and accelerates training. Herald exemplifies 
how application semantics at the model layer can inform efficient 
communication scheduling in machine learning (ML) training pipelines.
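
The following Python sketch conveys the flavor of location-aware input assignment 
under simplifying assumptions (a fixed embedding-row placement, a per-GPU capacity, 
and made-up sample IDs); it is not Herald's actual algorithm. Each sample is routed 
to the GPU that already holds most of the embedding rows it touches, and whatever 
still has to be fetched remotely defines the iteration's communication plan.

    # A minimal, hypothetical sketch of location-aware input assignment. The
    # row placement, capacities, and sample IDs below are made up for
    # illustration; Herald's real algorithm and data layout differ.
    gpu_rows = {
        0: {1, 2, 3, 7},      # embedding rows assumed resident on GPU 0
        1: {4, 5, 6, 8},      # embedding rows assumed resident on GPU 1
    }

    def assign(sample_ids, loads, capacity):
        # Prefer the GPU that already holds most of this sample's rows;
        # fall back to the least-loaded GPU when that GPU is at capacity.
        best = max(gpu_rows, key=lambda g: len(sample_ids & gpu_rows[g]))
        if loads[best] >= capacity:
            best = min(loads, key=loads.get)
        loads[best] += 1
        return best

    loads = {0: 0, 1: 0}
    batch = [{1, 2, 4}, {5, 6}, {3, 7}, {4, 8}]
    plan = [assign(s, loads, capacity=2) for s in batch]
    # Rows that still have to be fetched from a remote GPU define the
    # communication plan for this iteration.
    remote = [s - gpu_rows[g] for s, g in zip(batch, plan)]
    print(plan, remote)    # -> [0, 1, 0, 1] [{4}, set(), set(), set()]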

Third, we propose MixNet, a runtime-reconfigurable optical-electrical 
interconnect architecture designed for large-scale Mixture-of-Experts (MoE) 
training. MixNet regionally adapts its physical topology to match evolving 
communication patterns across training iterations. By combining the 
flexibility of optical circuit switching with the any-to-any reachability of 
electrical fabrics, MixNet approaches the performance of ideal topologies 
while maintaining practical cost-efficiency and scalability. MixNet 
demonstrates that fine-grained, application-aware topology reconfiguration 
can unlock new trade-offs in distributed ML interconnect design.
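
As a rough illustration of traffic-driven reconfiguration, the sketch below greedily 
assigns a limited budget of optical circuits to the heaviest rack-to-rack flows of an 
upcoming iteration and leaves the remaining flows on the electrical fabric. The 
traffic matrix, circuit budget, and greedy policy are illustrative assumptions, not 
MixNet's actual reconfiguration algorithm.

    # A minimal, hypothetical sketch of traffic-driven circuit planning. The
    # traffic matrix and circuit budget are illustrative only.
    def plan_circuits(traffic, num_circuits):
        # Greedily dedicate optical circuits to the heaviest rack pairs;
        # everything else stays on the electrical packet fabric.
        ranked = sorted(traffic.items(), key=lambda kv: kv[1], reverse=True)
        optical = dict(ranked[:num_circuits])
        electrical = dict(ranked[num_circuits:])
        return optical, electrical

    # Traffic (GB) expected for the next iteration's expert all-to-all,
    # keyed by (source rack, destination rack).
    traffic = {(0, 1): 40.0, (0, 2): 5.0, (1, 3): 32.0, (2, 3): 8.0}
    optical, electrical = plan_circuits(traffic, num_circuits=2)
    print("optical circuits:", optical)
    print("electrical fabric:", electrical)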


Date:                   Friday, 22 August 2025

Time:                   4:00pm - 6:00pm

Venue:                  Room 3494
                        Lifts 25/26

Committee Members:      Prof. Kai Chen (Supervisor)
                        Dr. Binhang Yuan (Chairperson)
                        Prof. Song Guo