Reducing the Ever-growing Cost of Machine Learning Services

Speaker: Dr. Binhang Yuan
         ETH Zurich

Title:  "Reducing the Ever-growing Cost of Machine Learning Services"

Date:   Monday, 20 February 2023 (Revised)

Time:   3:00pm - 4:00pm HKT

Zoom link:
https://hkust.zoom.us/j/465698645?pwd=aVRaNWs2RHNFcXpnWGlkR05wTTk3UT09

Meeting ID: 465 698 645
Passcode: 20222023


Abstract:

The recent success of machine learning (ML) has benefited dramatically
from the exponential growth of ML model capacity. However, this enormous
capacity also leads to significantly higher costs. In practice, the high
cost of ML comes from three sources: i) the cost of optimizing and
deploying ML services over ever-changing hardware; ii) low hardware
utilization due to parallel/distributed communication overhead; and
iii) the high cost of accessing the hardware. My work attempts to reduce
the cost in all three categories.

First, developing and deploying ML workflows in ever-changing execution
environments is a tedious and time-consuming job that requires
significant engineering effort to scale out the computation; my work
proposes new abstractions for ML system design and implementation that
combine expressivity, ease of optimization, and high performance.

Second, in parallel/distributed ML training, communication is usually
the main bottleneck restricting hardware efficiency; my work explores
system relaxations of communication under different parallel ML training
paradigms to increase hardware efficiency without compromising
statistical efficiency.

Third, building on these advances in system optimization and relaxation,
my work investigates how to deploy ML services over a decentralized,
open, collective environment consisting of much cheaper, underutilized
GPUs. The results are promising: even when the decentralized
interconnections are 100X slower than a data center network, with
efficient scheduling the end-to-end training throughput is only
1.7-3.5X slower than state-of-the-art solutions inside a data center.


*******************
Biography:

Binhang Yuan is a postdoctoral research scientist in the Department of
Computer Science at ETH Zurich, supervised by Ce Zhang. He received his
Bachelor of Science (2013) in Computer Science from Fudan University, and
his Master of Science (2016) and Ph.D. (2020) in Computer Science from
Rice University, where he was advised by Chris Jermaine. Binhang's
research interests lie in data management for machine learning and
distributed/decentralized machine learning systems. His work received the
Best Paper Honorable Mention Award at VLDB 2019 and the Research
Highlight Award at SIGMOD 2020.