Software-Hardware Co-Optimization for High-Performance, Resource-Efficient, and Secured GPU Cloud Platforms
PhD Qualifying Examination

Title: "Software-Hardware Co-Optimization for High-Performance, Resource-Efficient, and Secured GPU Cloud Platforms"

by Mr. Yongkang ZHANG

Abstract:

To maximize resource utilization in data centers, cloud service providers often colocate high-priority, latency-sensitive (LS) GPU tasks with low-priority, best-effort (BE) GPU tasks (referred to as tenants) on the same GPU. While recent research has explored improving the quality of service (QoS), resource efficiency, and security in multi-tenant GPU cloud platforms through software and hardware innovations, there remains a lack of a systematic review from a software-hardware co-optimization perspective. This gap limits researchers' ability to holistically optimize multi-tenant GPU cloud platforms. This survey addresses this need by first analyzing commodity GPU architectures and identifying key bottlenecks in guaranteeing QoS and security of GPU sharing. It then reviews existing optimization approaches for multi-tenant GPU cloud platforms (including software-based, OS-level, and hardware-level approaches) that aim to enhance performance, reduce resource costs, and bolster the security of multi-tenant GPU platforms. Finally, this work highlights promising research opportunities for co-optimizing software and hardware stacks, providing a comprehensive framework to bridge research gaps and inspire advancements in multi-tenant GPU sharing.

Date: Wednesday, 12 March 2025
Time: 4:00pm - 6:00pm
Venue: Room 3598, Lifts 27/28

Committee Members:
Dr. Shuai Wang (Supervisor)
Prof. Xiaowen Chu (Co-supervisor, HKUST-GZ)
Dr. Binhang Yuan (Chairperson)
Dr. Xiaomin Ouyang