A Survey of Communication Optimizations in Distributed Deep Learning
PhD Qualifying Examination

Title: "A Survey of Communication Optimizations in Distributed Deep Learning"

by

Mr. Lin ZHANG

Abstract:

Nowadays, distributed deep learning (DL) has become common practice for accelerating the training of large deep neural networks (DNNs) across multiple workers. However, such distributed training requires extensive communication. The communication overhead often consumes a significant portion of the training time and results in a severe performance bottleneck. How to address this communication issue has attracted much attention from both academia and industry, with the goal of improving system scalability. In this article, we present a survey of communication optimization techniques for data-parallel distributed DL. In particular, we focus on system architecture design and communication scheduling algorithms. The system architecture defines how workers exchange information, and communication scheduling algorithms can be applied to different architectures to better utilize the network capacity. These techniques are important because they do not change the training dynamics of the learning algorithms and can be directly integrated into existing DL frameworks. Furthermore, we find that distributed second-order algorithms have emerged to accelerate distributed DNN training, requiring fewer iterations to converge than first-order SGD algorithms. As existing communication solutions are mostly built on first-order algorithms, this motivates us to explore opportunities for communication optimization in the representative K-FAC algorithm.

Date: Wednesday, 1 December 2021
Time: 2:00pm - 4:00pm
Venue: Room 3494 (lifts 25/26)

Committee Members:
Prof. Bo Li (Supervisor)
Dr. Wei Wang (Chairperson)
Prof. Qiong Luo
Dr. Yangqiu Song

**** ALL are Welcome ****