Zero Copy Transport in Distributed Dataflow Applications
MPhil Thesis Defence

Title: "Zero Copy Transport in Distributed Dataflow Applications"

By

Mr. Bairen YI

Abstract

Dataflow is a common programming paradigm for processing data in a distributed fashion. In a dataflow application, a data processing task is expressed as a dataflow graph, with its vertices as specific operations and its edges as input/output relations or dataflow dependencies between operations. When the application is deployed inside a datacenter in which a large number of processors are available, the dataflow graph is partitioned and placed onto different processors for improved processing throughput. For graph edges that cross partition boundaries, data chunks need to be transferred between processors. Increasingly higher data volume and larger processing power of individual processors often create a communication bottleneck on the inter-processor links, resulting in serious performance degradation for distributed dataflow applications.

In recent years, Remote Direct Memory Access (RDMA) has become widely deployed in data centers as an alternative to the Transmission Control Protocol (TCP). RDMA offers ultra-low latency and CPU-bypass networking to application programmers. Existing applications are often designed around a socket-based software stack that manages application buffers separately from networking buffers and copies memory between them when sending and receiving data. With large application buffers (up to hundreds of MB), the cost of such copies adds non-trivial overhead to the end-to-end communication pipeline.

In this work, we attempt to design a zero copy transport for distributed dataflow applications that unifies application and networking buffer management and completely eliminates unnecessary memory copies. Our prototype on top of TensorFlow shows a 2.43x performance improvement over the gRPC-based transport and a 1.21x performance improvement over an alternative RDMA transport with private buffers and memory copies.
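The difference between a transport that stages data in a private networking buffer and one that shares the application buffer can be illustrated with a minimal Python sketch. This is not the thesis implementation: the function names `send_with_copy` and `send_zero_copy` are hypothetical, and the standard-library `memoryview` merely stands in for a buffer that a real RDMA transport would register with the network card.

```python
def send_with_copy(app_buffer: bytearray) -> bytearray:
    """Socket-style path (hypothetical): stage the payload in a separate
    networking buffer, which costs one full memory copy."""
    net_buffer = bytearray(len(app_buffer))
    net_buffer[:] = app_buffer  # explicit copy from application to networking buffer
    return net_buffer


def send_zero_copy(app_buffer: bytearray) -> memoryview:
    """Unified-buffer path (hypothetical): hand the transport a view of the
    application buffer itself, so no bytes are duplicated."""
    return memoryview(app_buffer)


app = bytearray(b"tensor-bytes")
staged = send_with_copy(app)   # independent copy of the payload
view = send_zero_copy(app)     # shares memory with `app`

# Mutating the application buffer is visible through the view,
# but not through the staged copy.
app[0] = ord("T")
print(bytes(staged))
print(bytes(view))
```

The staged copy is isolated from later writes at the price of copying every byte, while the view avoids the copy but requires the application and the transport to agree on who owns the buffer and when it may be reused. That shared ownership is the buffer management problem the abstract refers to.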
Date: Wednesday, 12 June 2019

Time: 4:00pm - 6:00pm

Venue: Room 5501
       Lifts 25/26

Committee Members: Dr. Kai Chen (Supervisor)
                   Dr. Ke Yi (Chairperson)
                   Dr. Wei Wang

**** ALL are Welcome ****