A survey on optimizing the inference efficiency of deep neural networks
PhD Qualifying Examination

Title: "A survey on optimizing the inference efficiency of deep neural networks"

by

Miss Jingzhi FANG

Abstract:

As deep neural networks (DNNs) have achieved great success in many areas (e.g., computer vision and natural language processing), reducing their inference time has become an important problem, especially for latency-critical applications such as autonomous driving, augmented reality, and language translation. Domain experts have designed many handwritten optimization techniques that reduce DNN inference time without changing model outputs, i.e., at the compilation level; these have been adopted by existing deep learning frameworks (e.g., TensorFlow, PyTorch) and hardware vendor-provided libraries (e.g., cuDNN). However, these optimizations require significant engineering effort, which limits the development and innovation of new models and specialized accelerators. Manually designed techniques may also fail to cover all possible cases and thus miss optimization opportunities. Moreover, directly applying all possible optimization techniques without holistic optimization often leads to sub-optimal inference efficiency. Therefore, researchers have been working on the automatic optimization of DNN inference efficiency in recent years. Since each DNN can be represented by a computation graph in which each node is an operator (e.g., matrix multiplication), existing works can be divided into three groups: optimization at the graph level, at the operator level, and at both levels. This survey reviews the works in each of these categories.

Date: Friday, 12 January 2024
Time: 12:00 noon - 2:00 pm
Venue: Room 3494 (lifts 25/26)

Committee Members:
Prof. Lei Chen (Supervisor)
Prof. Raymond Wong (Chairperson)
Prof. Qiong Luo
Prof. Ke Yi

**** ALL are Welcome ****
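To illustrate the idea described in the abstract, the following is a minimal sketch of graph-level optimization: a DNN is modeled as a computation graph of operators, and a rewrite rule (here, fusing a bias-add with a ReLU into one operator) reduces work without changing the model's outputs. All function and operator names below are hypothetical illustrations, not taken from the survey or any particular framework.

```python
def matmul(a, b):
    """Matrix multiplication on nested lists (one operator node)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def add_bias(m, bias):
    """Element-wise bias addition (one operator node)."""
    return [[v + b for v, b in zip(row, bias)] for row in m]

def relu(m):
    """Element-wise ReLU (one operator node)."""
    return [[max(v, 0.0) for v in row] for row in m]

def add_bias_relu(m, bias):
    """Fused operator: one pass over the data instead of two."""
    return [[max(v + b, 0.0) for v, b in zip(row, bias)] for row in m]

# The computation graph as a list of (name, fn) nodes in topological order.
graph = [("matmul", matmul), ("add_bias", add_bias), ("relu", relu)]

def fuse(graph):
    """Graph-level rewrite: merge adjacent add_bias + relu into one node."""
    out, i = [], 0
    while i < len(graph):
        if (i + 1 < len(graph)
                and graph[i][0] == "add_bias" and graph[i + 1][0] == "relu"):
            out.append(("add_bias_relu", add_bias_relu))
            i += 2
        else:
            out.append(graph[i])
            i += 1
    return out

def run(graph, x, w, bias):
    """Execute the graph on input x with weights w and bias."""
    y = x
    for name, fn in graph:
        if name == "matmul":
            y = fn(y, w)
        elif name in ("add_bias", "add_bias_relu"):
            y = fn(y, bias)
        else:
            y = fn(y)
    return y

x = [[1.0, -2.0]]
w = [[1.0, 0.0], [0.0, 1.0]]  # identity weights for a tiny example
bias = [0.5, 0.5]
fused = fuse(graph)
# The rewrite preserves outputs while shrinking the graph from 3 to 2 nodes.
assert run(fused, x, w, bias) == run(graph, x, w, bias)
assert len(fused) == 2
```

Operator-level optimization, by contrast, would keep the graph fixed and instead tune how each individual operator (e.g., `matmul`) is implemented for the target hardware.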