Recent Advances in Graph Partitioning for Increasing the Performance of Large-Scale Distributed Graph Processing

Speaker: Professor Hans-Arno Jacobsen
         University of Toronto

Title: "Recent Advances in Graph Partitioning for Increasing the
        Performance of Large-Scale Distributed Graph Processing"

Date:   Wednesday, 31 May 2023

Time:   4:00pm - 5:00pm

Venue:  Room 5508 (via lift 25/26), HKUST

Abstract:

Graph-structured data is found in various domains such as social networks,
websites, and recommendation networks. To analyze large graphs and gain
high-level insights, distributed graph processing frameworks such as
Spark/GraphX and Giraph have been established. For distributed processing,
the graph needs to be split into multiple partitions, while the cut size
and balancing of the partitions need to be optimized. This problem is
known as graph partitioning. In this talk, I will summarize recent
advances in graph partitioning and introduce important new concepts that
have been developed in my group. First, I will present novel techniques
that reduce the memory footprint of graph partitioning while maintaining
high partitioning quality: Hybrid Edge Partitioning and Two-Phase
Streaming. Second, I will introduce EASE, a framework for optimizing the
choice of partitioning technique for a given graph and processing
algorithm. EASE is based on
machine learning and achieves better performance than a manual partitioner
selection based on heuristics.
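
As a rough illustration of the two objectives named above (cut size and
balance), the following minimal Python sketch scores a given partitioning of
a toy graph. It is not taken from the talk: the function names, the toy
graph, and the balance definition (largest partition size divided by the
ideal size n / k) are assumptions chosen for illustration, and it assigns
vertices to partitions and counts cut edges, whereas edge partitioners such
as those mentioned in the abstract assign edges instead.

```python
from collections import Counter


def edge_cut(edges, part):
    """Count edges whose two endpoints lie in different partitions."""
    return sum(1 for u, v in edges if part[u] != part[v])


def balance(part, k):
    """Largest partition size divided by the ideal size n / k (1.0 is perfect)."""
    sizes = Counter(part.values())
    return max(sizes.values()) / (len(part) / k)


# Hypothetical toy graph: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 5), (5, 3)]

# One triangle per partition: only the bridging edge is cut.
good = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}

# Alternating assignment: equally balanced, but most edges are cut.
bad = {0: 0, 1: 1, 2: 0, 3: 1, 4: 0, 5: 1}

for name, part in (("good", good), ("bad", bad)):
    print(name, "cut =", edge_cut(edges, part), "balance =", balance(part, 2))
```

Both assignments are perfectly balanced, yet they differ widely in cut size,
which is why a partitioner has to optimize the two criteria together.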


*********************
Biography:

Hans-Arno Jacobsen holds the Jeffrey Skoll Chair in Computer Networking
and Innovation at the Edward S. Rogers Sr. Department of Electrical and Computer
Engineering, University of Toronto, where he is a professor of Computer
Engineering and Computer Science. His pioneering research lies at the
intersection of distributed systems, data management and data science,
with particular focus on blockchains, (complex) event processing, and
cyber-physical systems. Most recently, he has become interested in quantum
computing, where he is working on applications in molecular property
prediction (computational chemistry) and quantum machine learning, with
the long-term aim of building distributed quantum computing abstractions.
Arno is a Fellow of the IEEE.