Towards High-performance Datacenter Systems with Application-oriented Optimizations
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering
PhD Thesis Defence
Title: "Towards High-performance Datacenter Systems with Application-oriented
Optimizations"
By
Mr. Chaoliang ZENG
Abstract:
In recent decades, we have witnessed the extensive construction of datacenters
and the widespread deployment of various applications. Internet services and
cloud computing are rising rapidly while Moore's law and Dennard scaling are
slowing down, which creates a conflict between expanding application
requirements and the slow evolution of general-purpose processors. Therefore,
it is critical to build high-performance datacenter systems with
application-oriented optimizations.
This thesis describes our research efforts in building high-performance
datacenter systems through careful exploitation of application-specific
characteristics and hardware architectures. Specifically, we explore three
application-oriented datacenter systems.
First, we present Herald, a runtime embedding scheduler for efficient
cache-enabled recommendation model training. Herald exploits the
predictability and infrequency of embedding cache accesses to reduce
embedding transmissions between caches and the parameter server (PS) during
training. We believe that the scheduling philosophy of Herald can be extended
to the training of embedding models in general.
Second, we study the embedding-based retrieval algorithm from first
principles and derive a practically ideal architecture for optimal performance.
Based on the derived architecture, we propose FAERY for high-performance
embedding-based retrieval running on FPGA. FAERY leverages appropriate
parallelization techniques to orchestrate the key operators in embedding-based
retrieval, so that it can outperform CPU- and GPU-based approaches. Although
FAERY is a domain-specific accelerator for retrieval in recommendation systems,
we believe similar optimization techniques can be applied to other memory- and
compute-bound systems.
Third, we design Tiara, a three-tier hardware architecture to accelerate
stateful layer-4 load balancing. Tiara makes the best use of heterogeneous
hardware by decoupling the load balancing function. As a result, Tiara can
provide high performance with cost, energy, and space efficiency. We believe
Tiara's three-tier architecture is generic and can benefit other datacenter
gateway functions.
Date: Tuesday, 18 July 2023
Time: 2:00pm - 4:00pm
Venue: Room 3494 (Lifts 25/26)
Chairman: Prof. Shiheng WANG (ACCT)
Committee Members: Prof. Kai CHEN (Supervisor)
                   Prof. Gary CHAN
                   Prof. Dan XU
                   Prof. Jun ZHANG (ECE)
                   Prof. Hong XU (CUHK)
**** ALL are Welcome ****