More about HKUST
Accelerating In-Memory Subgraph Matching on a Single Machine
The Hong Kong University of Science and Technology Department of Computer Science and Engineering PhD Thesis Defence Title: "Accelerating In-Memory Subgraph Matching on a Single Machine" By Mr. Shixuan SUN Abstract Graph data are ubiquitous in practice, and subgraph matching is a basic operation in graph analytics. As modern computers typically have sufficient processing power and memory resources, we study how to accelerate in-memory subgraph matching on a single computer. Our approach is through algorithmic optimization and parallelization. First, we design PSM, a parallel algorithmic framework for backtracking-based subgraph matching. Specifically, we abstract the matching process as a depth-first search (DFS) tree, and parallelize the search by dividing up the tree into search regions and exploring these regions in parallel. We design an effective dynamic load balance strategy for the search. As a result, the PSM-style algorithms achieved a speedup of 15X-19X over their sequential counterparts on a twenty-core machine. Second, we develop LIGHT, a parallel subgraph matching algorithm for unlabeled graphs. The search space in unlabeled subgraph matching, or called subgraph enumeration, is big due to the absence of vertex labels. Our approach to reducing the search space and thus redundant computation is to defer the materialization of intermediate results and to increase their reuse. Furthermore, we parallelize our algorithm with vector and simultaneous multithreading instructions in modern computers. Our results show that LIGHT significantly outperforms the state of the art. Finally, we design vcFV, an efficient framework for finding all data graphs in a graph database each of which contains the given query graph. Existing work takes the indexing-filtering-verification approach to evaluate queries. However, the indexing phase suffers serious scalability problems. Our framework utilizes the leading subgraph matching algorithms to answer queries, which removes the indexing phase. The experiment results show that our framework is competitive in time performance and can scale to hundreds of thousands of data graphs and graphs of thousands of vertices. Date: Monday, 10 February 2020 Time: 5:00pm - 7:00pm Zoom Meeting: https://hkust.zoom.us/j/396223450 Chairman: Prof. Beifang CHEN (MATH) Committee Members: Prof. Qiong LUO (Supervisor) Prof. Raymond WONG Prof. Ke YI Prof. Weichuan YU (ECE) Prof. James CHENG (CUHK) **** ALL are Welcome ****