A Survey on the GPU Multiplexing for DNN Inference and Training: A Bottom-up Approach

PhD Qualifying Examination


Title: "A Survey on the GPU Multiplexing for DNN Inference and Training: A 
Bottom-up Approach"

by

Mr. Haoxuan YU


Abstract:

The intermittent and uneven resource demands of deep neural network (DNN) 
workloads lead to suboptimal utilization of GPUs as multi-dimensional 
resources, a problem exacerbated by the rapid advancement of GPU capabilities. 
To improve the utilization of GPUs, the dominant resource in DNN clusters, GPU 
multiplexing has become a widely adopted strategy, effectively reducing the 
total cost of ownership (TCO) of GPU 
clusters. However, the software infrastructure provided by hardware vendors 
lacks native support for GPU multiplexing, posing challenges for fine-grained 
GPU resource allocation and performance isolation.

This survey reviews recent research efforts on GPU multiplexing, with a focus 
on resource utilization and performance isolation. We analyze device drivers, 
programming toolkits, machine learning (ML) frameworks, and cluster managers 
from the bottom up, exploring opportunities to improve GPU multiplexing from 
multiple perspectives. We hope this survey sheds light on system optimization 
for GPU multiplexing and facilitates future designs of GPU multiplexing 
software stacks.


Date:                   Tuesday, 20 August 2024

Time:                   3:00pm - 5:00pm

Venue:                  Room 5510
                        Lifts 25/26

Committee Members:      Prof. Song Guo (Supervisor)
                        Dr. Wei Wang (Chairperson)
                        Prof. Kai Chen
                        Prof. Qian Zhang