Efficient Neural Networks for Image Recognition and Generation on the Edge
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

PhD Thesis Defence

Title: "Efficient Neural Networks for Image Recognition and Generation on the Edge"

By

Mr. Jierun CHEN

Abstract:

Neural networks have prevailed in many fields, with image recognition and generation as prominent examples. However, the rapid growth in model size and complexity has made them increasingly reliant on cloud servers or high-end GPUs. This dependency incurs high operational costs, privacy concerns, and round-trip latency. Deploying models to edge and mobile devices is emerging as a promising solution but often faces challenges such as limited memory, long runtime, and unsatisfactory user experience. In the post-Moore's Law era, hardware advancements alone are insufficient to overcome these challenges, highlighting the growing need for designing efficient neural networks for resource-constrained environments.

This thesis addresses these critical efficiency bottlenecks by introducing novel operators and architectures that enhance inference speed and reduce model size across various tasks, ranging from layout-specific applications to general visual tasks, and from image recognition to generation. First, we propose Translation Variant Convolution (TVConv), a novel operator optimized for layout-specific applications, such as face recognition, by leveraging spatial feature variance for efficient region-wise processing. Second, we identify inefficiencies in popular depthwise convolution, such as low compute intensity and frequent memory access, and present Partial Convolution (PConv) to overcome these inefficiencies. Building on this, we develop FasterNet, a family of neural networks that achieves considerably faster running speeds across various devices without sacrificing recognition accuracy. Finally, we introduce SnapGen, a highly compact and fast text-to-image model, capable of generating high-quality, high-resolution images directly and instantly on mobile devices.

These contributions collectively advance the democratization of neural networks on edge devices, enabling seamless, cost-effective, and privacy-preserving services accessible anytime, anywhere.

Date: Monday, 6 January 2025
Time: 3:30pm - 5:30pm
Venue: Room 3494, Lifts 25/26

Chairman: Dr. Tuan Anh NGUYEN (LIFS)

Committee Members:
Prof. Gary CHAN (Supervisor)
Prof. Qiong LUO
Dr. Dan XU
Dr. Jun ZHANG (ECE)
Dr. Lai Man PO (CityU)