Mining Similarities on Homogeneous and Heterogeneous Graphs

PhD Thesis Proposal Defence


Title: "Mining Similarities on Homogeneous and Heterogeneous Graphs"

by

Mr. Zijian LI


Abstract:

Mining similarities on graphs is a common and fundamental requirement on 
graph-based applications. The similarities on graphs can be generally separated 
into two categories: the similarity between graphs (i.e., graph similarity), 
and the similarity between nodes (i.e., node similarity) on a graph.

For homogeneous graphs, a popular graph similarity measure is the Graph Edit 
Distance (GED). Since exact GED computation is NP-hard, we propose a 
probabilistic approach to estimate GED between two graphs efficiently. In our 
method, we first introduce a novel graph similarity measure Graph Branch 
Distance (GBD), which can be efficiently computed in polynomial time. Then, we 
formulate the relationship between GED and GBD by considering branch variations 
as the result ascribed to graph edit operations, and model this process by 
probabilistic approaches. Moreover, we propose an efficient index G*-tree to 
compute the node similarity modeled by the distance between two nodes. G*-tree 
is a hierarchical index based on graph partitions whose root node represents 
the entire graph, and each node in G*-tree represents a subgraph of its parent 
node. Moreover, we build shortcuts between selected leaf nodes to optimize the 
efficiency of distance queries on G*-tree.

To compute node similarities on heterogeneous graphs, we propose a multi-view 
learning method TransN to encode the similarity between nodes into node 
embeddings. In TransN, we propose an algorithm to capture the proximity 
information inside every single view. To transfer information across views, we 
propose an algorithm to translate the node embeddings between different views 
based on the dual-learning mechanism, which can both capture the complex 
relations between node embeddings in different views, and preserve the 
proximity information inside each view during the translation.

The experiments show that our proposed methods have better efficiency, 
effectiveness, and scalability in the graph similarity search tasks than the 
state-of-the-art methods on real and synthetic data sets.


Date:			Monday, 18 November 2019

Time:                  	4:30pm - 6:30pm

Venue:                  Room 3494
                         lifts 25/26

Committee Members:	Prof. Lei Chen (Supervisor)
 			Dr. Raymond Wong (Chairperson)
 			Dr. Qiong Luo
 			Dr. Xiaojuan Ma


**** ALL are Welcome ****