Principles and Automation of Low-Level Optimizations on GPUs
PhD Thesis Proposal Defence

Title: "Principles and Automation of Low-Level Optimizations on GPUs"

by

Mr. Da YAN

Abstract:

Performance optimization on GPUs is still not well understood. This thesis discusses the principles and automation of performance optimizations on NVIDIA GPUs, with a special focus on compute-bound kernels. It concentrates on the abstraction layers between portable virtual instruction sets (e.g., LLVM IR, NVIDIA PTX) and native hardware assembly.

We first introduce the native GPU instruction set, Shader ASSembly (SASS). Previously, the public could not customize SASS generation, because the only way to generate SASS was through the closed-source proprietary compiler ptxas. ptxas hides many important optimizations, including instruction scheduling. We built an open-source assembler, TuringAs, that lets the public manipulate SASS directly, and we identified new optimization opportunities at the SASS level. For instance, using certain native SASS instructions helps reduce register pressure, and reordering SASS instructions improves instruction-level parallelism and thus throughput. We evaluate the effectiveness of our optimizations on Winograd convolution (a fast convolution algorithm) and Tensor Core matrix multiplication.

Next, we introduce our effort to automate SASS optimizations to improve productivity. Programming in SASS does not scale to a large number of kernels or to new GPU architectures. We built GASS, an LLVM-based compiler that automatically translates a high-level virtual representation (i.e., LLVM IR) into optimized SASS. We highlight our newly proposed instruction scheduler for compute-bound deep learning kernels, our customization of the if-conversion pass, and our algorithms for resolving data dependencies. The evaluation shows that the algorithms in GASS outperform LLVM's algorithms by a considerable margin and that GASS is on par with the highly optimized proprietary compiler ptxas.
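The abstract's point that reordering instructions can raise instruction-level parallelism can be illustrated with a toy model. The sketch below is not taken from the thesis and does not reflect TuringAs, GASS, or real SASS. It assumes a hypothetical in-order, single-issue pipeline, made-up latencies (4 cycles for a load, 1 cycle for an arithmetic instruction), and illustrative register names.

```python
from typing import NamedTuple, Sequence, Tuple


class Instr(NamedTuple):
    dst: str                # register written by the instruction
    srcs: Tuple[str, ...]   # registers read by the instruction
    latency: int            # cycles until dst is available (assumed values)


def total_cycles(program: Sequence[Instr]) -> int:
    """Cycles until every result is ready on a toy in-order, single-issue pipeline."""
    ready = {}   # register -> cycle at which its value becomes available
    cycle = 0    # earliest cycle at which the next instruction may issue
    finish = 0
    for ins in program:
        # An instruction issues once all of its source operands are ready.
        issue = max([cycle] + [ready.get(s, 0) for s in ins.srcs])
        ready[ins.dst] = issue + ins.latency
        finish = max(finish, ready[ins.dst])
        cycle = issue + 1   # at most one instruction issues per cycle
    return finish


# Two independent load -> use chains (hypothetical registers and latencies).
load_a = Instr("r1", ("addr_a",), 4)
use_a = Instr("r2", ("r1",), 1)
load_b = Instr("r3", ("addr_b",), 4)
use_b = Instr("r4", ("r3",), 1)

naive = [load_a, use_a, load_b, use_b]       # each use stalls on its own load
reordered = [load_a, load_b, use_a, use_b]   # the two load latencies overlap

print(total_cycles(naive), total_cycles(reordered))
```

Under these assumptions the reordered sequence finishes sooner because the second load is issued while the first is still in flight. A real GPU scheduler also has to account for register pressure, multiple warps, and hardware-specific dependency tracking, which this model ignores.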
Date: Thursday, 2 December 2021

Time: 2:00pm - 4:00pm

Venue: Room 4503, Lifts 25/26

Committee Members:
Dr. Wei Wang (Supervisor)
Dr. Lionel Parreaux (Chairperson)
Prof. Qiong Luo
Dr. Shuai Wang

**** ALL are Welcome ****