Domain-Specific Network Techniques for Distributed Deep Learning: A Survey

PhD Qualifying Examination


Title: "Domain-Specific Network Techniques for Distributed Deep Learning: A
Survey"

by

Miss Wenxue LI


Abstract:

Deep Learning (DL) has witnessed remarkable success in the past decade,
enabling significant advancements across various applications. The increasing
capability of DL models has led to the development of distributed deep learning
(DDL) systems, where communication becomes a significant bottleneck that
impacts the end-to-end performance of both training and serving. As a result,
there is a growing interest in designing efficient domain-specific network
techniques for DDL.

In this survey, we present an up-to-date and thorough introduction to existing
domain-specific networked systems for DDL. We first provide the background of
DDL, including training, inference, and existing parallelism strategies. Then
we focus on recent domain-specific network techniques proposed for accelerating
DDL, such as communication-computation overlapping, priority-based communication
scheduling, topology-parallelism co-design, network-aware cluster scheduling,
and optimizations aimed at reducing large language model (LLM) inference
latency. In closing, we present several directions for future research.


Date:                   Wednesday, 15 November 2023

Time:                   3:00pm - 5:00pm

Venue:                  Room 2126D
                        lift 19

Committee Members:      Prof. Kai Chen (Supervisor)
                        Dr. Qifeng Chen (Chairperson)
                        Prof. Gary Chan
                        Dr. Binhang Yuan


**** ALL are Welcome ****