More about HKUST
Mining Similarity and Causality on Graphs
The Hong Kong University of Science and Technology Department of Computer Science and Engineering PhD Thesis Defence Title: "Mining Similarity and Causality on Graphs" By Mr. Zijian LI Abstract Similarity and causality mining on graphs are common and fundamental requirements for graph-based applications. Specifically, the similarities on graphs can be separated into two categories: the similarity between graphs (i.e., graph similarity), and the similarity between nodes (i.e., node similarity) on a graph. A common graph similarity measure is the Graph Edit Distance (GED). In this work, we propose a probabilistic approach, GBDA, to approximate GED between two graphs according to the Graph Branch Distance (GBD) between them. On the other hand, we propose an efficient index, G*-tree, to compute the node similarity measured by the distance between two nodes, and we propose a shortcut selection algorithm to optimize the efficiency of distance queries on G*-tree. Moreover, we propose a multi-view learning method, TransN, to encode the similarity between nodes into node embeddings. In TransN, we propose an algorithm to capture the proximity information inside every single view. To transfer information across views, we propose an algorithm to translate the node embeddings between different views based on the dual-learning mechanism. The causality on knowledge graphs is represented by the reasoning paths for each fact. The state-of-the-art solution of the reasoning path finding is to train a reinforcement learning agent which selects some reasoning paths for a given fact. However, the existing methods cannot distinguish the quality of different reasoning paths well. To address this problem, in this paper, we model the problem of finding high-quality reasoning paths as an optimization problem on the embedding space, which is NP-hard. Then, we propose a novel approximation method, RETINA, to search the reasoning paths of high quality. To enhance the efficiency of RETINA, we propose a set of lower bounds to prune reasoning paths in the search process. The experiments show that our proposed methods have better efficiency, effectiveness, and scalability than the state-of-the-art methods on real and synthetic data sets. Date: Friday, 17 January 2020 Time: 3:00pm - 5:00pm Venue: Room 5510 Lifts 25/26 Chairman: Prof. Francesco CIUCCI (MAE) Committee Members: Prof. Lei CHEN (Supervisor) Prof. Bo LI Prof. Qiong LUO Prof. Can YANG (MATH) Prof. Pei JIAN (Simon Fraser University) **** ALL are Welcome ****