Efficient DNN Inference via Device-Cloud Collaboration: A Survey
PhD Qualifying Examination

Title: "Efficient DNN Inference via Device-Cloud Collaboration: A Survey"

by

Mr. Jingcan CHEN

Abstract:

As deep neural networks (DNNs) have become widely used in mobile applications, the model deployment paradigm has shifted from cloud-centric to edge computing, which preserves data privacy and eliminates network delays to the cloud server. However, the limited resources of edge devices pose substantial challenges to deploying DNNs on them. A typical solution is to leverage the external computing resources of cloud servers, i.e., device-cloud collaborative deployment. This survey reviews recent efforts on collaborative DNN deployment, where edge devices and cloud servers are orchestrated together to achieve efficient model inference under constrained device resources. We first introduce mainstream computing paradigms, the benefits of device-cloud collaboration, and preliminary knowledge of DNNs. We then derive a holistic taxonomy of the state-of-the-art optimization technologies that empower device-cloud collaboration to boost inference performance, including model splitting, early exit, and model collaboration. Finally, we discuss future directions and open issues. We stress that the collaboration paradigm for large language models is still in its infancy, and more systematic optimizations are needed to support multi-user scenarios in the collaborative computation paradigm.

Date: Friday, 28 March 2025
Time: 1:00pm - 3:00pm
Venue: Room 5562 (Lifts 27/28)

Committee Members:
Prof. Mo Li (Supervisor)
Prof. Song Guo (Chairperson)
Dr. Wei Wang
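To make the two main techniques named in the abstract concrete, the following is a minimal, hypothetical sketch of how model splitting and early exit can combine in a device-cloud pipeline: the device runs the front portion of the model and exits locally when a lightweight head is confident enough; otherwise it ships intermediate features (not raw input) to the cloud for the remaining layers. All names here (`device_head`, `cloud_tail`, `CONF_THRESHOLD`) and the toy computations are illustrative, not from the survey itself.

```python
# Illustrative sketch of device-cloud collaborative inference.
# "Model splitting": device runs early layers, cloud runs the rest.
# "Early exit": a cheap on-device head answers directly when confident.
# The threshold and the toy math below are assumptions for demonstration.

CONF_THRESHOLD = 0.9  # assumed confidence cutoff for exiting on-device

def device_head(x):
    """Stand-in for the on-device front portion of a split DNN."""
    features = [v * 0.5 for v in x]                    # toy feature extraction
    confidence = max(features) / (sum(features) + 1e-9)  # toy confidence score
    return features, confidence

def cloud_tail(features):
    """Stand-in for the cloud-side back portion of the split model."""
    return sum(features)                               # toy final prediction

def collaborative_infer(x):
    features, conf = device_head(x)
    if conf >= CONF_THRESHOLD:
        # Early exit: confident enough on-device, skip the network round trip.
        return ("device", max(features))
    # Model splitting: send intermediate features to the cloud for completion.
    return ("cloud", cloud_tail(features))
```

An input dominated by one large value yields a high confidence score and exits on-device, while a uniform input falls below the threshold and is offloaded; this is the basic trade-off between local latency and cloud accuracy that the surveyed optimizations tune.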