More about HKUST
Efficient Temporal Graph Neural Networks with Theoretical Guarantees
PhD Thesis Proposal Defence
Title: "Efficient Temporal Graph Neural Networks with Theoretical Guarantees"
by
Mr. Yiming LI
Abstract:
Representation learning over dynamic graphs is critical for many real-world
applications such as social network services and recommender systems. Temporal
graph neural networks (T-GNNs) are powerful representation learning methods and
have achieved remarkable effectiveness on continuous-time dynamic graphs.
Unfortunately, T-GNNs continue to face the challenge of high time complexity.
This complexity increases linearly with the number of timestamps and grows
exponentially with the model’s depth, making them impractical for scaling to
large dynamic graphs.
This thesis proposal studies two types of data management techniques, i.e.,
cache replacement and query processing, for improving the efficiency of T-GNNs
with theoretical guarantees. In the first work, we propose Orca, a novel
framework that accelerates TGNN training by non-trivially caching and reusing
intermediate embeddings. We design a brand new and optimal cache replacement
algorithm, called MRD, under a practical cache limit. MRD not only improves the
efficiency of training T-GNNs by maximizing the number of cache hits but also
reduces the approximation errors by avoiding keeping and reusing extremely
stale embeddings. Meanwhile, we develop profound theoretical analysis on the
approximation error introduced by our reuse schemes and offer rigorous
convergence guarantees. Extensive experiments have validated that Orca can
obtain two orders of magnitude speedup over the state-of-the-art baselines
while achieving higher precision on large dynamic graphs. In the second work,
we build the theoretical connection between the temporal message passing scheme
adopted by T-GNNs and the temporal random walk process on dynamic graphs. Our
theoretical analysis indicates that it would be possible to select a few
influential temporal neighbors to compute a target node’s representation
without compromising the predictive performance. Based on this finding, we
propose to utilize T-PPR, a parameterized metric for estimating the influence
score of nodes on evolving graphs. We further develop an efficient single-scan
algorithm to answer the top-k T-PPR query with rigorous approximation
guarantees. Finally, we present Zebra, a scalable framework that accelerates
the computation of T-GNN by directly aggregating the features of the most
prominent temporal neighbors returned by the top-k T-PPR query. Extensive
experiments have validated that Zebra can be up to two orders of magnitude
faster than the state-of-the-art T-GNN while attaining better performance.
Date: Friday, 20 October 2023
Time: 9:00am - 11:00am
Venue: Room 4475
lifts 25/26
Committee Members: Prof. Lei Chen (Supervisor)
Prof. Raymond Wong (Chairperson)
Prof. Ke Yi
Dr. Binhang Yuan
**** ALL are Welcome ****