PhD Thesis Proposal Defence
Title: "Integer-Only Training and Low-Bit Quantization of Deep Neural
Networks without Graphic Processing Units"
by
Mr. Jaewoo SONG
Abstract:
Deep neural networks (DNNs) have become essential in diverse fields, driven
by advancements in graphics processing units (GPUs). However, the increasing
complexity of DNN models requires substantial computational resources,
posing challenges for training and inference without access to
high-performance hardware. This research addresses these challenges by
exploring integer-based approaches to DNN training and inference, aiming to
reduce storage and computational demands. For training, the study introduces
a novel application of direct feedback alignment (DFA) to enable
integer-only training, overcoming a key limitation of backpropagation, which
is prone to integer overflow when implemented with integer arithmetic. The
developed framework, PocketNN, implemented
in pure C++, demonstrates successful integer-only training and has been
adopted by other research projects for its compatibility with low-power edge
devices. For inference, the study specifically targeted the quantization of
large language models (LLMs). The research proposes SplitQuantV2, an
innovative algorithm that splits layers with weights in DNNs to increase
quantization resolution. While most of quantization algorithms specialized
for LLMs require high-end GPUs and calibration dataset, SplitQuantV2 does
not require neither of them and can run efficiently with CPUs. Tested on
Llama 3.2 1B Instruct model, SplitQuantV2 significantly improves the
accuracy of INT4 quantized models, achieving results comparable to their
floating-point counterparts, while requiring minimal computational
resources. Together, PocketNN and SplitQuantV2 advance the field of
integer-based DNN training and quantization. By promoting the deployment of
artificial intelligence (AI) in environments with limited hardware
capabilities and contributing to energy conservation, this research will
ultimately broaden the accessibility and applicability of AI technologies.
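
As background on the training approach described in the abstract, the following
is a minimal C++ sketch of integer-only training with direct feedback alignment
(DFA), assuming a toy two-layer perceptron. It illustrates the general technique
only and is not the PocketNN implementation: the layer sizes, the clipped integer
activation, and the power-of-two shift factors are all assumptions made for this
example. Instead of propagating the error backwards through transposed weight
matrices as backpropagation does, each hidden layer receives the output error
through a fixed random integer feedback matrix, and right-shifts after every
product keep the integer values in range.

// Minimal sketch: integer-only training of a toy 2-layer perceptron with DFA.
// Not the PocketNN API; sizes, activation, and shift factors are assumptions.
#include <algorithm>
#include <cstdint>
#include <cstdlib>
#include <iostream>
#include <vector>

using Vec = std::vector<int32_t>;
using Mat = std::vector<Vec>;                 // row-major: Mat[row][col]

constexpr int SHIFT = 8;                      // rescale by 2^-8 after each matmul

// y = clipped ReLU of (W * x) >> SHIFT, all in integer arithmetic
Vec forward(const Mat& W, const Vec& x) {
    Vec y(W.size(), 0);
    for (size_t i = 0; i < W.size(); ++i) {
        int64_t acc = 0;                      // 64-bit accumulator, 32-bit storage
        for (size_t j = 0; j < x.size(); ++j) acc += (int64_t)W[i][j] * x[j];
        int64_t v = acc >> SHIFT;             // arithmetic shift as a power-of-two scale
        y[i] = (int32_t)std::max<int64_t>(0, std::min<int64_t>(v, 127));
    }
    return y;
}

int main() {
    std::srand(42);
    const int in = 4, hid = 8, out = 2;

    auto randMat = [](int rows, int cols) {
        Mat m(rows, Vec(cols));
        for (auto& r : m) for (auto& v : r) v = std::rand() % 7 - 3;  // small ints in [-3, 3]
        return m;
    };

    Mat W1 = randMat(hid, in), W2 = randMat(out, hid);
    Mat B  = randMat(hid, out);               // fixed random feedback matrix, never trained

    Vec x = {10, -5, 3, 7};
    Vec target = {64, 0};

    for (int step = 0; step < 100; ++step) {
        Vec h = forward(W1, x);
        Vec y = forward(W2, h);

        // Output error e = y - target (plain integer subtraction)
        Vec e(out);
        for (int i = 0; i < out; ++i) e[i] = y[i] - target[i];

        // DFA: hidden error = (B * e) >> SHIFT, instead of W2^T * e as in backpropagation
        Vec dh(hid, 0);
        for (int i = 0; i < hid; ++i) {
            int64_t acc = 0;
            for (int k = 0; k < out; ++k) acc += (int64_t)B[i][k] * e[k];
            dh[i] = (int32_t)(acc >> SHIFT);
            if (h[i] == 0) dh[i] = 0;         // derivative of the clipped ReLU
        }

        // Integer SGD; the shifts act as hand-picked power-of-two learning rates
        for (int i = 0; i < out; ++i)
            for (int j = 0; j < hid; ++j) W2[i][j] -= (e[i] * h[j]) >> 10;
        for (int i = 0; i < hid; ++i)
            for (int j = 0; j < in; ++j)  W1[i][j] -= (dh[i] * x[j]) >> 4;
    }

    Vec y = forward(W2, forward(W1, x));
    std::cout << "output after training: " << y[0] << ", " << y[1] << "\n";
    return 0;
}

Because the hidden-layer error comes directly from the output error through one
fixed feedback matrix, rather than through a chain of transposed weight
multiplications, the integer error signal passes through far fewer products,
which is what makes the overflow problem mentioned in the abstract easier to
control.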
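
The layer-splitting idea behind SplitQuantV2 can likewise be illustrated with a
short sketch. The snippet below only compares the round-trip error of plain INT4
quantization against quantizing three weight groups with separate scales; the toy
weight values, the simple one-dimensional k-means with k = 3, and the affine INT4
scheme are assumptions made for illustration and do not reproduce the actual
SplitQuantV2 algorithm or its evaluation on Llama 3.2.

// Sketch: why splitting a weight tensor into groups with their own scales
// increases effective INT4 resolution. Toy data and grouping; not SplitQuantV2 itself.
#include <algorithm>
#include <cmath>
#include <iostream>
#include <vector>

// Affine INT4 round-trip mean-squared error for one group with its own scale.
double int4GroupMSE(const std::vector<float>& g) {
    if (g.empty()) return 0.0;
    float lo = *std::min_element(g.begin(), g.end());
    float hi = *std::max_element(g.begin(), g.end());
    float scale = std::max((hi - lo) / 15.0f, 1e-8f);    // 16 levels: q in [0, 15]
    double err = 0.0;
    for (float v : g) {
        int q = (int)std::lround((v - lo) / scale);
        q = std::max(0, std::min(15, q));
        float deq = lo + q * scale;
        err += (double)(v - deq) * (v - deq);
    }
    return err / g.size();
}

int main() {
    // Toy weights: a dense bulk near zero plus two large outliers, the regime in
    // which a single INT4 scale wastes nearly all of its 16 levels on the outliers.
    std::vector<float> w;
    for (int i = 0; i < 1000; ++i) w.push_back(0.01f * (i % 40 - 20));
    w.push_back(4.0f);
    w.push_back(-3.5f);

    double plain = int4GroupMSE(w);                       // one scale for the whole tensor

    // Split into k = 3 groups with a tiny 1-D k-means, then quantize each group separately.
    std::vector<float> s = w;
    std::sort(s.begin(), s.end());
    float c[3] = { s.front(), s[s.size() / 2], s.back() };  // init centroids: min, median, max
    std::vector<int> assign(w.size(), 0);
    for (int iter = 0; iter < 20; ++iter) {
        double sum[3] = {0, 0, 0};
        int cnt[3] = {0, 0, 0};
        for (size_t i = 0; i < w.size(); ++i) {
            int best = 0;
            for (int k = 1; k < 3; ++k)
                if (std::fabs(w[i] - c[k]) < std::fabs(w[i] - c[best])) best = k;
            assign[i] = best;
            sum[best] += w[i];
            cnt[best]++;
        }
        for (int k = 0; k < 3; ++k) if (cnt[k] > 0) c[k] = (float)(sum[k] / cnt[k]);
    }
    std::vector<float> grp[3];
    for (size_t i = 0; i < w.size(); ++i) grp[assign[i]].push_back(w[i]);

    double split = 0.0;
    for (int k = 0; k < 3; ++k) split += int4GroupMSE(grp[k]) * grp[k].size();
    split /= w.size();

    std::cout << "single-scale INT4 MSE: " << plain << "\n";
    std::cout << "split (3 groups) INT4 MSE: " << split << "\n";
    return 0;
}

With the outliers isolated into their own groups, each group's 16 quantization
levels cover a much narrower value range, so the round-trip error of the bulk
weights drops by orders of magnitude. Splitting a weighted layer so that each
part can be quantized with its own scale is the intuition the abstract refers to
as increasing quantization resolution.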
Date: Monday, 17 February 2025
Time: 9:45am - 11:30am
Venue: Room 4472
Lifts 25/26
Committee Members: Prof. Fangzhen Lin (Supervisor)
Prof. Dit-Yan Yeung (Chairperson)
Dr. Shuai Wang