Efficient Temporal Graph Neural Networks with Theoretical Guarantees

PhD Thesis Proposal Defence


Title: "Efficient Temporal Graph Neural Networks with Theoretical Guarantees"

by

Mr. Yiming LI


Abstract:

Representation learning over dynamic graphs is critical for many real-world 
applications such as social network services and recommender systems. Temporal 
graph neural networks (T-GNNs) are powerful representation learning methods and 
have achieved remarkable effectiveness on continuous-time dynamic graphs. 
Unfortunately, T-GNNs continue to face the challenge of high time complexity. 
This complexity increases linearly with the number of timestamps and grows 
exponentially with the model’s depth, making them impractical for scaling to 
large dynamic graphs.

This thesis proposal studies two types of data management techniques, i.e., 
cache replacement and query processing, for improving the efficiency of T-GNNs 
with theoretical guarantees. In the first work, we propose Orca, a novel 
framework that accelerates TGNN training by non-trivially caching and reusing 
intermediate embeddings. We design a brand new and optimal cache replacement 
algorithm, called MRD, under a practical cache limit. MRD not only improves the 
efficiency of training T-GNNs by maximizing the number of cache hits but also 
reduces the approximation errors by avoiding keeping and reusing extremely 
stale embeddings. Meanwhile, we develop profound theoretical analysis on the 
approximation error introduced by our reuse schemes and offer rigorous 
convergence guarantees. Extensive experiments have validated that Orca can 
obtain two orders of magnitude speedup over the state-of-the-art baselines 
while achieving higher precision on large dynamic graphs. In the second work, 
we build the theoretical connection between the temporal message passing scheme 
adopted by T-GNNs and the temporal random walk process on dynamic graphs. Our 
theoretical analysis indicates that it would be possible to select a few 
influential temporal neighbors to compute a target node’s representation 
without compromising the predictive performance. Based on this finding, we 
propose to utilize T-PPR, a parameterized metric for estimating the influence 
score of nodes on evolving graphs. We further develop an efficient single-scan 
algorithm to answer the top-k T-PPR query with rigorous approximation 
guarantees. Finally, we present Zebra, a scalable framework that accelerates 
the computation of T-GNN by directly aggregating the features of the most 
prominent temporal neighbors returned by the top-k T-PPR query. Extensive 
experiments have validated that Zebra can be up to two orders of magnitude 
faster than the state-of-the-art T-GNN while attaining better performance.


Date:			Friday, 20 October 2023

Time:                  	9:00am - 11:00am

Venue:                  Room 4475
                         lifts 25/26

Committee Members:	Prof. Lei Chen (Supervisor)
 			Prof. Raymond Wong (Chairperson)
 			Prof. Ke Yi
 			Dr. Binhang Yuan


**** ALL are Welcome ****