On the Fairness of Network Resource Allocation for Data-Parallel Applications

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence


Title: "On the Fairness of Network Resource Allocation for Data-Parallel 
Applications"

By

Mr. Shiyao MA


Abstract


Fair allocation of network resources for data-parallel applications
is a challenging undertaking. For one thing, the conflict between the
increasing volume of communication and limited link bandwidth is
intensifying with the popularization of big data.
Moreover, the distributed nature of data-parallel tasks gives rise to a
correlated traffic pattern: a job is considered complete only when its
coflow (the flows of all its constituent tasks) has finished, which
renders schemes based on per-flow fairness inapplicable. In the face of
these challenges, this thesis presents a systematic study of how to
ensure the progress of network communication for data-parallel
applications.

Our first insight is that data locality should be exploited to
reduce network transfers, thus accelerating application progress and
alleviating network contention. This is of critical importance to
data-processing applications such as Hadoop and Spark, which spend a
huge amount of time reading input blocks scattered on data servers. We
propose Custody, a cluster management framework that transparently
retrieves locality information of input data blocks and allocates
machines with local data to applications in a fair fashion by solving
the data-aware resource sharing problem.

Even with data locality in hand, network transfers are still
inevitable and are often enormous, e.g., in the shuffle phase of
services such as web search, video analytics and graph processing.
Therefore, network isolation should be provided so that the worst-case
performance of each service is assured. We observe that such an
isolation guarantee can be maximized by careful placement of tasks. A
two-step allocation scheme is proposed where we first coordinate the
placement of tasks based on access link status and bandwidth demands
of each application, and then enforce the bandwidth allocation of
tasks within an application.

While per-application network isolation is an ideal pursuit that
ensures the progress of each application, it nonetheless drags down
overall performance. The problem becomes more severe when applications
run under hard deadline requirements. Our next endeavor is
to share the network links in a fair fashion so as to meet the
deadlines of as many applications as possible. Existing flow-level
scheduling schemes are insufficient to guarantee the coflow-level
application performance since a coflow can meet its deadline only when
all its constituent flows finish on time. We present Chronos, a
scheduling framework that captures the correlation of flows belonging
to the same coflow and allocates network resources among multiple
concurrent coflows with their deadlines in mind.


Date:			Monday, 20 August 2018

Time:			2:30pm - 4:30pm

Venue:			Room 5504
 			Lifts 25/26

Chairman:		Prof. Patrick Yue (ECE)

Committee Members:	Prof. Bo Li (Supervisor)
 			Prof. Qiong Luo
 			Prof. Ke Yi
 			Prof. Chin-Tau Lea (ECE)
 			Prof. Chuan Wu (COMP, HKU)


**** ALL are Welcome ****