Network Compression via Quantization and Sparsification
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

PhD Thesis Defence

Title: "Network Compression via Quantization and Sparsification"

By

Miss Lu HOU

Abstract

Deep neural network models, though very powerful and highly successful, are computationally expensive in terms of space and time. Recently, there have been a number of attempts at compressing the network weights. These attempts greatly reduce the network size, and make it possible to deploy deep models in resource-constrained environments. In this thesis, we focus on two kinds of network compression methods: quantization and sparsification.

We first propose to directly minimize the loss w.r.t. the quantized weights by using the proximal Newton algorithm. We provide a closed-form solution for binarization, as well as an efficient approximate solution for ternarization and m-bit (where m > 2) quantization. To speed up distributed training of weight-quantized networks, we then propose to use gradient quantization to reduce the communication cost, and theoretically study how the combination of weight and gradient quantization affects convergence. In addition, since previous quantization methods usually perform poorly on LSTMs, we study why training quantized LSTMs is difficult, and show that popular normalization schemes can help stabilize their training.

While weight quantization reduces redundancy in the representation of each weight, network sparsification reduces redundancy in the number of weights. To achieve a higher compression rate, we extend the previous quantization-only formulation to a more general network compression framework that allows simultaneous quantization and sparsification. Finally, we find that sparse deep neural networks obtained by pruning resemble biological neural networks in many ways. Inspired by the power-law distributions observed in many biological neural networks, we show that these pruned networks also exhibit power-law properties, which can be exploited for faster learning and smaller networks in continual learning.
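To make the closed-form binarization mentioned above concrete, here is a minimal sketch of what such a proximal Newton step can look like. It assumes a positive diagonal Hessian approximation d (e.g. a second-moment estimate as maintained by Adam); the function name and the particular curvature-weighted scaling are illustrative assumptions, not necessarily the thesis's exact derivation.

    import numpy as np

    def binarize_loss_aware(w, d):
        # One loss-aware binarization step on a weight array w:
        # replace w by alpha * sign(w), where alpha weights each
        # coordinate by the diagonal curvature estimate d.
        b = np.sign(w)
        b[b == 0] = 1.0                            # break ties at zero
        alpha = np.sum(d * np.abs(w)) / np.sum(d)  # curvature-weighted scaling
        return alpha * b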
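The gradient quantization used to cut communication cost can likewise be illustrated with a generic stochastic m-bit quantizer; this is a sketch under our own assumptions (QSGD-style unbiased rounding), not necessarily the exact scheme analyzed in the thesis. In a distributed setting, each worker would send quantize_gradient(g, m) in place of the full-precision gradient g.

    import numpy as np

    def quantize_gradient(g, m=2, rng=np.random.default_rng(0)):
        # Stochastically round each entry of g onto 2**m - 1 evenly
        # spaced levels in [-s, s], where s = max |g|.  Rounding up or
        # down at random keeps the quantized gradient unbiased.
        s = np.max(np.abs(g))
        if s == 0:
            return g
        levels = 2 ** m - 1
        scaled = (g / s + 1.0) / 2.0 * levels          # map entries to [0, levels]
        lower = np.floor(scaled)
        up = rng.random(g.shape) < (scaled - lower)    # round up w.p. the fraction
        return ((lower + up) / levels * 2.0 - 1.0) * s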
Date: Tuesday, 30 July 2019
Time: 10:30am - 12:30pm
Venue: Room 3494 (Lifts 25/26)

Chairman: Prof. Chi-Ying Tsui (ISD)

Committee Members: Prof. James Kwok (Supervisor)
                   Prof. Kai Chen
                   Prof. Dit-Yan Yeung
                   Prof. Yuan Yao (MATH)
                   Prof. Sungroh Yoon (Seoul National University)

**** ALL are Welcome ****