More about HKUST
Mining Similarities on Homogeneous and Heterogeneous Graphs
PhD Thesis Proposal Defence Title: "Mining Similarities on Homogeneous and Heterogeneous Graphs" by Mr. Zijian LI Abstract: Mining similarities on graphs is a common and fundamental requirement on graph-based applications. The similarities on graphs can be generally separated into two categories: the similarity between graphs (i.e., graph similarity), and the similarity between nodes (i.e., node similarity) on a graph. For homogeneous graphs, a popular graph similarity measure is the Graph Edit Distance (GED). Since exact GED computation is NP-hard, we propose a probabilistic approach to estimate GED between two graphs efficiently. In our method, we first introduce a novel graph similarity measure Graph Branch Distance (GBD), which can be efficiently computed in polynomial time. Then, we formulate the relationship between GED and GBD by considering branch variations as the result ascribed to graph edit operations, and model this process by probabilistic approaches. Moreover, we propose an efficient index G*-tree to compute the node similarity modeled by the distance between two nodes. G*-tree is a hierarchical index based on graph partitions whose root node represents the entire graph, and each node in G*-tree represents a subgraph of its parent node. Moreover, we build shortcuts between selected leaf nodes to optimize the efficiency of distance queries on G*-tree. To compute node similarities on heterogeneous graphs, we propose a multi-view learning method TransN to encode the similarity between nodes into node embeddings. In TransN, we propose an algorithm to capture the proximity information inside every single view. To transfer information across views, we propose an algorithm to translate the node embeddings between different views based on the dual-learning mechanism, which can both capture the complex relations between node embeddings in different views, and preserve the proximity information inside each view during the translation. The experiments show that our proposed methods have better efficiency, effectiveness, and scalability in the graph similarity search tasks than the state-of-the-art methods on real and synthetic data sets. Date: Monday, 18 November 2019 Time: 4:30pm - 6:30pm Venue: Room 3494 lifts 25/26 Committee Members: Prof. Lei Chen (Supervisor) Dr. Raymond Wong (Chairperson) Dr. Qiong Luo Dr. Xiaojuan Ma **** ALL are Welcome ****