Principles and Automation of Low-Level Optimizations on GPUs

PhD Thesis Proposal Defence


Title: "Principles and Automation of Low-Level Optimizations on GPUs"

by

Mr. Da YAN


Abstract:

Performance optimization on GPUs is still not well understood. This thesis
discusses the principles and automation of performance optimizations on NVIDIA
GPUs, with a special focus on compute-bound kernels. It concentrates on the
abstraction layers between portable virtual instruction sets (e.g., LLVM IR,
NVIDIA PTX) and native hardware assembly.

We first introduce the native GPU instruction set, Shader ASSembly (SASS).
Previously, the public could not customize SASS generation, because the only
way to generate SASS was through the closed-source proprietary compiler ptxas.
ptxas hides many important optimizations, including instruction scheduling. We
built an open-source assembler, TuringAs, that lets the public manipulate SASS
directly, and we identified new optimization opportunities at the SASS level.
For instance, using certain native SASS instructions helps reduce register
pressure, and reordering SASS instructions yields better instruction-level
parallelism and thus higher throughput. We evaluate the effectiveness of our
optimizations on two examples: Winograd convolution (a fast convolution
algorithm) and Tensor Core matrix multiplication.

Next, we introduce our effort to automate SASS optimizations to improve
productivity. Programming in SASS does not scale to a large number of kernels
or to new GPU architectures. We built GASS, an LLVM-based compiler that
automatically translates a high-level virtual representation (i.e., LLVM IR)
into optimized SASS. We highlight our newly proposed instruction scheduler for
compute-bound deep learning kernels, our customization of the if-conversion
pass, and our algorithms for resolving data dependencies. The evaluation shows
that our algorithms in GASS outperform LLVM's algorithms by a considerable
margin and that GASS is on par with the highly optimized proprietary compiler
ptxas.


Date:                   Thursday, 2 December 2021

Time:                   2:00pm - 4:00pm

Venue:                  Room 4503
                        Lifts 25/26

Committee Members:      Dr. Wei Wang (Supervisor)
                        Dr. Lionel Parreaux (Chairperson)
                        Prof. Qiong Luo
                        Dr. Shuai Wang


**** ALL are Welcome ****