Integer-Only Training and Low-Bit Quantization of Deep Neural Networks without Graphics Processing Units

PhD Thesis Proposal Defence


Title: "Integer-Only Training and Low-Bit Quantization of Deep Neural 
Networks without Graphic Processing Units"

by

Mr. Jaewoo SONG


Abstract:

Deep neural networks (DNNs) have become essential in diverse fields, driven 
by advancements in graphics processing units (GPUs). However, the increasing 
complexity of DNN models requires substantial computational resources, 
posing challenges for training and inference without access to 
high-performance hardware. This research addresses these challenges by 
exploring integer-based approaches to DNN training and inference, aiming to 
reduce storage and computational demands. For training, the study introduces 
a novel application of direct feedback alignment (DFA) to enable 
integer-only training, overcoming the limitations of backpropagation, which 
is prone to integer overflow. The developed framework, PocketNN, implemented 
in pure C++, demonstrates successful integer-only training and has been 
adopted by other research projects for its compatibility with low-power edge 
devices. For inference, the study specifically targets the quantization 
of large language models (LLMs). The research proposes SplitQuantV2, an 
innovative algorithm that splits DNN layers containing weights in order 
to increase quantization resolution. While most quantization algorithms 
specialized for LLMs require high-end GPUs and a calibration dataset, 
SplitQuantV2 requires neither and runs efficiently on CPUs. Tested on the 
Llama 3.2 1B Instruct model, SplitQuantV2 significantly improves the 
accuracy of INT4-quantized models, achieving results comparable to their 
floating-point counterparts while requiring minimal computational 
resources. Together, PocketNN and SplitQuantV2 advance the field of 
integer-based DNN training and quantization; brief illustrative sketches 
of both ideas follow this abstract. By promoting the deployment of 
artificial intelligence (AI) in environments with limited hardware 
capabilities and contributing to energy conservation, this research will 
ultimately broaden the accessibility and applicability of AI technologies.
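
To make the training idea concrete, below is a minimal, hypothetical 
sketch (written in NumPy rather than PocketNN's actual C++ code) of an 
integer-only training step based on direct feedback alignment. The layer 
sizes, shift amounts, and update rule are illustrative assumptions: the 
output error is projected to the hidden layer through a fixed random 
integer matrix, and bit shifts stand in for floating-point scaling so 
that all arithmetic stays in 32-bit integers.

    import numpy as np

    # Hypothetical sketch of one integer-only DFA training step.
    # All tensors are int32; shapes, shifts, and the update rule are
    # illustrative assumptions, not PocketNN's actual implementation.

    rng = np.random.default_rng(0)
    SHIFT = 8  # right-shift products by 8 bits in place of float rescaling

    # Two-layer network: x -> W1 -> h -> W2 -> y_hat
    W1 = rng.integers(-64, 64, size=(16, 8), dtype=np.int32)
    W2 = rng.integers(-64, 64, size=(8, 4), dtype=np.int32)
    # Fixed random feedback matrix used by DFA
    B2 = rng.integers(-64, 64, size=(4, 8), dtype=np.int32)

    def train_step(x, y, lr_shift=12):
        # Forward pass; shifts keep intermediate values in the int32 range
        h = np.maximum((x @ W1) >> SHIFT, 0)   # integer ReLU
        y_hat = (h @ W2) >> SHIFT

        # DFA: project the output error straight to the hidden layer
        # through the fixed matrix B2 instead of backpropagating via W2.T
        e = y_hat - y
        e_h = ((e @ B2) >> SHIFT) * (h > 0)

        # Integer weight updates; a right shift replaces the learning rate
        dW2 = (h.T @ e) >> lr_shift
        dW1 = (x.T @ e_h) >> lr_shift
        return dW1, dW2

    x = rng.integers(-128, 128, size=(1, 16), dtype=np.int32)
    y = rng.integers(-128, 128, size=(1, 4), dtype=np.int32)
    dW1, dW2 = train_step(x, y)

Because the error reaches each layer through a single fixed projection 
rather than the long chain of transposed-weight multiplications used by 
backpropagation, intermediate values are easier to keep within integer 
range, which is one reason an integer-only formulation can avoid the 
overflow issues mentioned above.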
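
The layer-splitting idea behind SplitQuantV2 can be illustrated in the 
same spirit. The hypothetical NumPy sketch below partitions a layer's 
weights into groups that each receive their own quantization scale, which 
increases the effective resolution of INT4 quantization compared with a 
single scale for the whole layer; the grouping heuristic and quantization 
scheme here are illustrative assumptions, not SplitQuantV2's actual 
procedure.

    import numpy as np

    # Hypothetical illustration of layer splitting before low-bit
    # quantization. The per-channel grouping and symmetric INT4 scheme
    # are illustrative assumptions only.

    rng = np.random.default_rng(0)
    W = rng.normal(0.0, 1.0, size=(64, 32)).astype(np.float32)
    W[:4] *= 20.0  # a few large-magnitude output channels stretch the scale

    def quant_dequant_int4(w):
        # Symmetric INT4 quantization (levels -8..7) with a single scale
        scale = np.abs(w).max() / 7.0
        q = np.clip(np.round(w / scale), -8, 7)
        return q * scale

    # Baseline: one quantization scale for the whole layer
    err_single = np.abs(W - quant_dequant_int4(W)).mean()

    # Split: group output channels of similar magnitude, quantize each
    # group with its own scale, then stack the results back. Functionally
    # this is the original layer split into smaller layers whose outputs
    # are concatenated.
    order = np.argsort(np.abs(W).max(axis=1))
    groups = np.array_split(order, 3)
    W_split = np.empty_like(W)
    for idx in groups:
        W_split[idx] = quant_dequant_int4(W[idx])
    err_split = np.abs(W - W_split).mean()

    print(f"mean abs error, single scale: {err_single:.4f}")
    print(f"mean abs error, split layers: {err_split:.4f}")

Running the sketch shows a noticeably smaller mean reconstruction error 
for the split version, since outlier channels no longer stretch the 
quantization scale shared by the ordinary channels.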


Date:                   Monday, 17 February 2025

Time:                   9:45am - 11:30am

Venue:                  Room 4472
                        Lifts 25/26

Committee Members:      Prof. Fangzhen Lin (Supervisor)
                        Prof. Dit-Yan Yeung (Chairperson)
                        Dr. Shuai Wang