More about HKUST
Efficient Algorithms for SimRank
PhD Thesis Proposal Defence
Title: "Efficient Algorithms for SimRank"
by
Mr. Yue WANG
Abstract:
Measuring similarities among different nodes is a fundamental problem in graph
analysis. Among different similarity measurements, SimRank is one of the most
promising and popular. Due to the huge amount of data generated by the
activities of people every data, today's graphs are often big and time
evolving, examples include on-line social networks and on-line shopping
platforms. This challenges current algorithms for SimRank computation w.r.t.
efficiency, scalability and quality of results. In the thesis, we first study
the problem of all-pairs SimRank computation. Observing current iterative
methods for all-pairs SimRank are not efficient in time and space, due to
unnecessary cost and storage by the nature of iterative updating, we propose a
local push based algorithm, which has the property that not all SimRank scores
are involved in the computation. The push-on-demand schema can reduce a lot of
unnecessary cost, and has an accuracy guarantee. We further extend our
algorithm to track accurate SimRank scores in dynamic graphs, which can address
the accuracy issue of current incremental solutions. We then study the pairwise
SimRank estimation problem, observing that current single-pair SimRank
solutions are either static or inefficient in handling dynamic cases with
good-quality results, we propose three algorithms to query pairwise SimRank
over static and dynamic graphs efficiently, by using different sample reduction
strategies. The accuracy of our algorithms is guaranteed by the different
invariants we propose for pairwise SimRank. Finally, we study the problem
finding similar pairs given a set of node pairs with SimRank, which has
attractive applications in personalized search and recommendation tasks. We
present an efficient framework for retrieving the top-k similarities from an
arbitrary set of pairs. In addition, we introduce two types of indexes to boost
the efficiency, one is hub-based, the other is tree-based.
Date: Thursday, 29 November 2018
Time: 2:00pm - 4:00pm
Venue: Room 3494
(lifts 25/26)
Committee Members: Prof. Lei Chen (Supervisor)
Dr. Ke Yi (Chairperson)
Dr. Qiong Luo
Dr. Wei Wang
**** ALL are Welcome ****