Efficient Neural Networks for Image Recognition and Generation
PhD Thesis Proposal Defence
Title: "Efficient Neural Networks for Image Recognition and Generation"
by
Mr. Jierun CHEN
Abstract:
Over the past decade, neural networks have prevailed in many fields, with image
recognition and generation as prominent examples. However, their rapid growth
in model size and complexity has outpaced the hardware gains of a slowing Moore's Law,
restricting their deployment to cloud-based servers or high-performance GPUs.
This dependency incurs significant operational costs, round-trip latency,
reliance on internet connectivity, and privacy concerns due to data
transmission to third parties. In resource-constrained environments, such as
mobile and edge devices, these models also struggle with memory limitations,
reduced processing speed, and poor user experience. Designing efficient neural
networks is therefore essential to overcoming these challenges, democratizing
neural networks, and unlocking their broader applications.
This thesis addresses the critical need for more efficient neural networks by
introducing novel operators and architectures that enhance inference speed,
reduce model size, and optimize memory usage across various tasks, from
layout-specific to general visual applications, and from image recognition to
generation. First, we propose Translation Variant Convolution (TVConv), a novel
operator tailored for layout-specific tasks such as face recognition, which
leverages spatial feature variance to enable efficient region-wise processing.
Next, we identify inefficiencies in widely used operators, such as low compute
intensity and frequent memory access, and present Partial Convolution (PConv)
to overcome them. Building on this, we propose FasterNet, a
family of neural networks that delivers considerably faster running speeds
across multiple devices without sacrificing accuracy. Finally, we develop
EfficientGen, a portable, cost-effective diffusion model for text-to-image
generation on mobile devices. It supports high resolutions and flexible aspect
ratios, and produces images of superior visual quality in under a second.
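The abstract does not spell out how PConv cuts computation. As a rough illustration only, the sketch below assumes the formulation in the published FasterNet work: a regular convolution is applied to a fraction of the input channels and the remaining channels pass through untouched. The function names and the 1/4 channel ratio are illustrative choices, not taken from the abstract.

```python
def conv_flops(h, w, k, c_in, c_out):
    """Multiply-accumulate count of a dense k x k convolution
    (stride 1, output spatial size h x w)."""
    return h * w * k * k * c_in * c_out


def partial_conv_flops(h, w, k, c, ratio=0.25):
    """Multiply-accumulates when only the first c * ratio channels are
    convolved and the rest are left untouched (assumed PConv behaviour)."""
    c_p = int(c * ratio)  # number of channels that are actually convolved
    return conv_flops(h, w, k, c_p, c_p)


if __name__ == "__main__":
    # Hypothetical layer shape chosen only for illustration.
    h, w, k, c = 56, 56, 3, 64
    full = conv_flops(h, w, k, c, c)
    partial = partial_conv_flops(h, w, k, c)
    print(f"regular conv: {full} MACs")
    print(f"partial conv: {partial} MACs ({partial / full:.4f} of regular)")
```

Under these assumptions the cost scales with the square of the channel ratio, so convolving a quarter of the channels needs one sixteenth of the multiply-accumulates of a regular convolution over the same feature map.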
Date: Friday, 18 October 2024
Time: 3:00pm - 5:00pm
Venue: Room 5501
Lifts 25/26
Committee Members: Prof. Gary Chan (Supervisor)
Prof. Raymond Wong (Chairperson)
Prof. Pedro Sander
Dr. Dan Xu