A survey on optimizing the inference efficiency of deep neural networks
PhD Qualifying Examination

Title: "A survey on optimizing the inference efficiency of deep neural networks"

by

Miss Jingzhi FANG

Abstract:

As deep neural networks (DNNs) have achieved great success in many areas (e.g., computer vision and natural language processing), reducing their inference time has become an important problem, especially for latency-critical applications such as autonomous driving, augmented reality, and language translation. Domain experts have designed many handwritten optimization techniques that reduce DNN inference time without changing model outputs, i.e., at the compilation level; these have been adopted by existing deep learning frameworks (e.g., TensorFlow, PyTorch) and hardware vendor-provided libraries (e.g., cuDNN). However, these optimizations require significant engineering effort, which limits the development and innovation of new models and specialized accelerators. Manually designed techniques may also fail to cover all possible cases and thus miss optimization opportunities. Moreover, directly applying all possible optimization techniques without holistic optimization often leads to sub-optimal inference efficiency. Therefore, researchers have been working on the automatic optimization of DNN inference efficiency in recent years. Since each DNN can be represented by a computation graph in which each node is an operator (e.g., matrix multiplication), existing works can be divided into three groups: optimization at the graph level, at the operator level, and at both levels. This survey reviews the works in each of these categories.

Date: Friday, 12 January 2024
Time: 12:00 noon - 2:00 pm
Venue: Room 3494 (lifts 25/26)

Committee Members:
Prof. Lei Chen (Supervisor)
Prof. Raymond Wong (Chairperson)
Prof. Qiong Luo
Prof. Ke Yi

**** ALL are Welcome ****
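To illustrate the idea described in the abstract, the following is a minimal sketch of graph-level optimization: a DNN is modeled as a computation graph of operators, and a rewrite rule (here, fusing a bias-add with a ReLU into one operator) reduces work without changing the model's outputs. All function and operator names below are hypothetical illustrations, not taken from the survey or any particular framework.

```python
def matmul(a, b):
    """Matrix multiplication on nested lists (one operator node)."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def add_bias(m, bias):
    """Element-wise bias addition (one operator node)."""
    return [[v + b for v, b in zip(row, bias)] for row in m]

def relu(m):
    """Element-wise ReLU (one operator node)."""
    return [[max(v, 0.0) for v in row] for row in m]

def add_bias_relu(m, bias):
    """Fused operator: one pass over the data instead of two."""
    return [[max(v + b, 0.0) for v, b in zip(row, bias)] for row in m]

# The computation graph as a list of (name, fn) nodes in topological order.
graph = [("matmul", matmul), ("add_bias", add_bias), ("relu", relu)]

def fuse(graph):
    """Graph-level rewrite: merge adjacent add_bias + relu into one node."""
    out, i = [], 0
    while i < len(graph):
        if (i + 1 < len(graph)
                and graph[i][0] == "add_bias" and graph[i + 1][0] == "relu"):
            out.append(("add_bias_relu", add_bias_relu))
            i += 2
        else:
            out.append(graph[i])
            i += 1
    return out

def run(graph, x, w, bias):
    """Execute the graph on input x with weights w and bias."""
    y = x
    for name, fn in graph:
        if name == "matmul":
            y = fn(y, w)
        elif name in ("add_bias", "add_bias_relu"):
            y = fn(y, bias)
        else:
            y = fn(y)
    return y

x = [[1.0, -2.0]]
w = [[1.0, 0.0], [0.0, 1.0]]  # identity weights for a tiny example
bias = [0.5, 0.5]
fused = fuse(graph)
# The rewrite preserves outputs while shrinking the graph from 3 to 2 nodes.
assert run(fused, x, w, bias) == run(graph, x, w, bias)
assert len(fused) == 2
```

Operator-level optimization, by contrast, would keep the graph fixed and instead tune how each individual operator (e.g., `matmul`) is implemented for the target hardware.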