More about HKUST
Efficient Temporal Graph Neural Networks with Theoretical Guarantees
PhD Thesis Proposal Defence Title: "Efficient Temporal Graph Neural Networks with Theoretical Guarantees" by Mr. Yiming LI Abstract: Representation learning over dynamic graphs is critical for many real-world applications such as social network services and recommender systems. Temporal graph neural networks (T-GNNs) are powerful representation learning methods and have achieved remarkable effectiveness on continuous-time dynamic graphs. Unfortunately, T-GNNs continue to face the challenge of high time complexity. This complexity increases linearly with the number of timestamps and grows exponentially with the model’s depth, making them impractical for scaling to large dynamic graphs. This thesis proposal studies two types of data management techniques, i.e., cache replacement and query processing, for improving the efficiency of T-GNNs with theoretical guarantees. In the first work, we propose Orca, a novel framework that accelerates TGNN training by non-trivially caching and reusing intermediate embeddings. We design a brand new and optimal cache replacement algorithm, called MRD, under a practical cache limit. MRD not only improves the efficiency of training T-GNNs by maximizing the number of cache hits but also reduces the approximation errors by avoiding keeping and reusing extremely stale embeddings. Meanwhile, we develop profound theoretical analysis on the approximation error introduced by our reuse schemes and offer rigorous convergence guarantees. Extensive experiments have validated that Orca can obtain two orders of magnitude speedup over the state-of-the-art baselines while achieving higher precision on large dynamic graphs. In the second work, we build the theoretical connection between the temporal message passing scheme adopted by T-GNNs and the temporal random walk process on dynamic graphs. Our theoretical analysis indicates that it would be possible to select a few influential temporal neighbors to compute a target node’s representation without compromising the predictive performance. Based on this finding, we propose to utilize T-PPR, a parameterized metric for estimating the influence score of nodes on evolving graphs. We further develop an efficient single-scan algorithm to answer the top-k T-PPR query with rigorous approximation guarantees. Finally, we present Zebra, a scalable framework that accelerates the computation of T-GNN by directly aggregating the features of the most prominent temporal neighbors returned by the top-k T-PPR query. Extensive experiments have validated that Zebra can be up to two orders of magnitude faster than the state-of-the-art T-GNN while attaining better performance. Date: Friday, 20 October 2023 Time: 9:00am - 11:00am Venue: Room 4475 lifts 25/26 Committee Members: Prof. Lei Chen (Supervisor) Prof. Raymond Wong (Chairperson) Prof. Ke Yi Dr. Binhang Yuan **** ALL are Welcome ****