Efficient Neural Networks for Image Recognition and Generation on the Edge
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering
PhD Thesis Defence
Title: "Efficient Neural Networks for Image Recognition and Generation on the
Edge"
By
Mr. Jierun CHEN
Abstract:
Neural networks have prevailed in many fields, with image recognition and
generation as prominent examples. However, the rapid growth in model size and
complexity has made them increasingly reliant on cloud servers or high-end
GPUs. This dependency incurs high operational costs, privacy concerns, and
round-trip latency. Deploying models to edge and mobile devices is emerging
as a promising solution but often faces challenges such as limited memory,
long runtime, and unsatisfactory user experience. In the post-Moore's Law
era, hardware advancements alone are insufficient to overcome these
challenges, highlighting the growing need for designing efficient neural
networks for resource-constrained environments.
This thesis addresses these critical efficiency bottlenecks by introducing
novel operators and architectures that enhance inference speed and reduce
model size across various tasks, ranging from layout-specific applications to
general visual tasks, and from image recognition to generation. First, we
propose Translation Variant Convolution (TVConv), a novel operator optimized
for layout-specific applications, such as face recognition, by leveraging
spatial feature variance for efficient region-wise processing. Second, we
identify inefficiencies in the widely used depthwise convolution, such as low
compute intensity and frequent memory access, and present Partial Convolution
(PConv) to overcome them. Building on PConv, we develop FasterNet, a
family of neural networks that achieves considerably faster running speeds
across various devices without sacrificing recognition accuracy. Finally, we
introduce SnapGen, a highly compact and fast text-to-image model, capable of
generating high-quality, high-resolution images directly and instantly on
mobile devices. These contributions collectively advance the democratization
of neural networks on edge devices, enabling seamless, cost-effective, and
privacy-preserving services accessible anytime, anywhere.
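
As a rough illustration of the operator comparison in the abstract, the sketch below counts multiply-accumulate operations for a dense convolution, a depthwise convolution, and a partial convolution. It rests on assumptions the abstract does not state: that PConv applies a dense convolution to a fraction of the channels and passes the rest through unchanged, and that the fraction is 1/4. The function names and the feature-map size are hypothetical and chosen only for the example.

```python
def conv_flops(h, w, k, c_in, c_out):
    """Multiply-accumulates of a dense k x k convolution (stride 1, same padding)."""
    return h * w * k * k * c_in * c_out


def depthwise_flops(h, w, k, c):
    """Depthwise convolution: one k x k filter per channel."""
    return h * w * k * k * c


def partial_conv_flops(h, w, k, c, ratio=0.25):
    """Assumed PConv form: dense conv on the first c * ratio channels only.

    The remaining channels are passed through untouched, so they add no
    multiply-accumulates. The default ratio of 1/4 is an assumption.
    """
    c_p = int(c * ratio)
    return conv_flops(h, w, k, c_p, c_p)


# Hypothetical feature map: 56 x 56 spatial size, 64 channels, 3 x 3 kernel.
h, w, k, c = 56, 56, 3, 64
regular = conv_flops(h, w, k, c, c)
depthwise = depthwise_flops(h, w, k, c)
partial = partial_conv_flops(h, w, k, c)

print("regular  :", regular)
print("depthwise:", depthwise)
print("partial  :", partial)
print("regular / partial =", regular // partial)
```

Under these assumptions the partial convolution needs 1/16 of the operations of the dense one. The depthwise count is lower still, which shows that operation counts alone do not capture the memory-access cost the abstract refers to; measured latency on the target device is the relevant metric.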
Date: Monday, 6 January 2025
Time: 3:30pm - 5:30pm
Venue: Room 3494 (Lifts 25/26)
Chairman: Dr. Tuan Anh NGUYEN (LIFS)
Committee Members: Prof. Gary CHAN (Supervisor)
                   Prof. Qiong LUO
                   Dr. Dan XU
                   Dr. Jun ZHANG (ECE)
                   Dr. Lai Man PO (CityU)