Effective Keyword Search on Large Scale Graphs in a Distributed System
MPhil Thesis Defence

Title: "Effective Keyword Search on Large Scale Graphs in a Distributed System"

By Miss Mengyu LI

Abstract

Keyword queries are more user-friendly than structured queries (SQL, XQuery, SPARQL) because they require no prior knowledge of the underlying database schema. In the past decade, much work has focused on keyword search in structured databases such as relational databases, and in unstructured databases such as XML and HTML databases. In particular, since the labeled graph model can represent both structured and unstructured databases, how to search graphs efficiently has been a hot topic in the database community. Furthermore, with the rapid development of the semantic web, graph data has grown to a huge scale and has to be stored and processed in a distributed environment. However, no existing approach can be directly applied to answer keyword queries over distributed graphs. The challenges are twofold: the poor locality of keyword query answers may lead to a high communication cost for graph exploration, and to a huge traffic load for shifting data among different machines.

To address these two challenges, in this thesis we first propose an adaptive index construction method, which optimizes the index to minimize the distributed query processing cost within the space budget on each machine. Then, based on this indexing scheme, we introduce an advanced distributed query processing algorithm, which includes an optimal scheduling problem defined to reduce traffic and time cost, and we propose a greedy algorithm that gives a factor-2 approximation for this scheduling problem. Finally, we verify the effectiveness of our solutions on large scale real datasets.

Date: Tuesday, 18 June 2013
Time: 2:00pm – 4:00pm
Venue: Room 3501, Lifts 25/26

Committee Members:
Dr. Lei Chen (Supervisor)
Dr. Ke Yi (Chairperson)
Dr. Raymond Wong

**** ALL are Welcome ****