Hyperscale Data Processing with Network-centric Designs

Speaker: Qizhen ZHANG
         University of Pennsylvania

Title:   "Hyperscale Data Processing with Network-centric Designs"

Date:    Thursday, 17 February 2022

Time:    10:00am - 11:00am (HKT)

Zoom link:
https://hkust.zoom.us/j/928308079?pwd=MW9wTCtlSDd2MnViZGdNd2oreUpXZz09

Meeting ID:     928 308 079
Passcode:       20212022

Abstract:

Today's largest data processing workloads are hosted in cloud data
centers. Due to exponential data growth and the end of Moore's Law, these
workloads have ballooned to hyperscale, encompassing billions to trillions
of data items and hundreds to thousands of servers per query. Highly
scalable data center networks both enable hyperscale data processing and
expand alongside it. Hyperscale fundamentally challenges the designs of data
processing systems and data center networks. My research rethinks the
interactions between these two layers and seeks optimal solutions for
supporting data processing in data centers and evolving the cloud
infrastructure.

In this talk, I will present network-centric designs, a principled and
cross-layer approach to building systems for hyperscale. It concerns the
performance of data processing in both current networks and future
networks, as well as how networks evolve. To demonstrate the efficiency of
this approach, I will first discuss GraphRex, which combines classic
database and systems techniques to push the performance of massive graph
queries in current data centers. I will then introduce data processing in
disaggregated data centers (DDCs), a promising new cloud proposal. In
particular, I will detail TELEPORT, a system that allows data processing
systems to unlock all DDC benefits. Finally, I will show MimicNet,
which facilitates network innovation at scale.

****************
Biography:

Qizhen Zhang is a Ph.D. candidate in the Department of Computer and
Information Science at the University of Pennsylvania, advised by Vincent
Liu and Boon Thau Loo. His dissertation research bridges cloud data
processing systems and data center networks to address emerging challenges
in hyperscale data processing. He is broadly interested in data management,
computer systems, and networking, and his research spans the data
processing stack. His work appears at database and systems conferences
such as SIGMOD, VLDB, and SIGCOMM.