Scaling Graph Neural Network Training on Large Graphs for Effectiveness and Efficiency

PhD Thesis Proposal Defence


Title: "Scaling Graph Neural Network Training on Large Graphs for Effectiveness 
and Efficiency"

by

Miss Jingshu PENG


Abstract:

Graph data has been prevalent with its ability to model many real-life 
applications woven by huge and complex networks of relations and 
interactions. Recently, graph neural networks (GNNs) have come into the 
spotlight in their great success to model the tangled relations in graph 
data, as a powerful machine learning tool. However, the major challenge 
that limits the adoption and deployment of GNNs to large-scale graphs in 
the real world, despite their promising performance, lies in the inability 
to utilize all the data in finite time and the scalability of the 
algorithm itself.

To mitigate the memory requirement, tremendous efforts have been made 
towards sampling-based mini-batch GNN training. Therefore, various 
sampling strategies have been proposed, aiming to improve the training 
efficiency and scalability of current GNN models over large-scale graphs, 
such as node-wise sampling, layer-wise importance sampling, and the 
fundamentally different subgraph sampling. Though existing samplers can 
help GNNs to scale over large-scale graphs, information loss from not 
sufficiently extracting higher-order neighbors inevitably limits the 
expressive power of these shallow architectures. Moreover, the size of the 
graph data and the complexity of the model can easily reach the limit of a 
single device.

To mitigate the memory requirement with ever-increasing graph data and 
model size, distributed GNN processing is the inevitable remedy. Yet, a 
distributed implementation of the GNN learning algorithms, especially an 
efficient one, is nowhere near trivial. Aside from the substantial memory 
footprint, distributed GNNs are memory-intensive as well as 
compute-intensive due to coupled irregular neighborhood access and 
iterative learning procedure. Consequently, both the intensity and the 
volume of data communication and the balance of workload makes distributed 
GNNs more challenging. Some efforts have been made towards distributed GNN 
training, at the price of heavy preprocessing, complex workflow, sampling 
overhead, information loss, and no convergence guarantee.

In this thesis proposal, we investigate the scaling of GNN training on 
large graphs in pursuit of effectiveness and efficiency. First, we propose 
to further advance the effectiveness of sampling-based GNNs in a single 
device environment. We propose an adaptive and structure-aware important 
subgraph sampler to make adaptive GNNs obtain the benefit from 
higher-order neighbor information. Evaluation on 8 benchmark datasets 
demonstrates the competitive performance of our approach to 6 SOTA 
approaches.

Second, we propose to study efficient distributed GNN processing over 
large graphs while preserving the effectiveness. We identify one major 
archenemy of efficient distributed GNN processing is the cross-device data 
communication by virtue of such data movement in GNN aggregation. To avoid 
the communication, we propose to abstract decentralized GNN processing as 
sequential matrix multiplication and uses historical embeddings via cache.

Finally, we conclude this thesis proposal with future directions and 
challenges.


Date:			Thursday, 26 May 2022

Time:                  	2:00pm - 4:00pm

Zoom Meeting:
https://hkust.zoom.us/j/95129881510?pwd=blBkb0hmM0dzM2hYZko2emZ4R2xldz09

Committee Members:	Prof. Lei Chen (Supervisor)
 			Prof. Lionel Ni (Supervisor)
  			Prof. Qiong Luo (Chairperson)
 			Prof. Ke Yi
 			Prof. Xiaofang Zhou


**** ALL are Welcome ****