Efficient DNN Inference via Device-Cloud Collaboration: A Survey

PhD Qualifying Examination


Title: "Efficient DNN Inference via Device-Cloud Collaboration: A Survey"

by

Mr. Jingcan CHEN


Abstract:

As deep neural networks (DNNs) become widely adopted in mobile 
applications, the model deployment paradigm has shifted from cloud-centric 
to edge computing, which preserves data privacy and eliminates network 
delays to the cloud server. However, the limited resources of edge devices 
pose substantial challenges to deploying DNNs on them. A typical solution 
is to leverage the external computing resources of cloud servers, i.e., 
device-cloud collaborative deployment. This survey reviews recent efforts 
on collaborative DNN deployment, in which edge devices and cloud servers 
are jointly orchestrated to achieve efficient model inference under 
constrained device resources. We first introduce the mainstream computing 
paradigms, the benefits of device-cloud collaboration, and preliminary 
knowledge of DNNs. We then derive a holistic taxonomy of the 
state-of-the-art optimization techniques that empower device-cloud 
collaboration to boost inference performance, including model splitting, 
early exit, and model collaboration. Finally, we discuss future directions 
and open issues. We stress that the collaboration paradigm for large 
language models is still in its infancy, and more systematic optimizations 
are needed to support multi-user scenarios in the collaborative computing 
paradigm.


Date:                   Friday, 28 March 2025

Time:                   1:00pm - 3:00pm

Venue:                  Room 5562
                        Lifts 27/28

Committee Members:      Prof. Mo Li (Supervisor)
                        Prof. Song Guo (Chairperson)
                        Dr. Wei Wang