The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

PhD Thesis Defence

"Full-Text Keyword Search in Meta-search and P2P Networks"

By

Miss Jing Zhao

Abstract

Due to the information growth, distributed systems, such as peer-to-peer (P2P) systems or meta-search systems, have become a popular and to some extent revolutionary solution to large-scale data sharing. They offer advantages such as autonomy, flexibility for peers to join and leave the system freely, scalability without the need of powerful and expensive processors, and robustness against single-peer failures. However, the "open nature" of P2P systems and their lack of centralized control pose difficult challenges to full-text search, which can be implemented in centralized systems with powerful search ability and high precision.

We study keyword query methods in meta-search and P2P networks. For meta-search, we propose server ranking approaches in which each search engine's document collection is divided into clusters based on the index terms. Furthermore, we keep the term correlation information in a cluster descriptor to improve the server ranking quality.

For P2P networks, we propose an efficient and scalable technique to support partial-match queries. A distributed index structure, called the distributed pattern tree (DPTree), is used to record frequent query patterns, i.e., combinations of keywords, learnt from the query history at each node in the network. Using this index, a query can identify its best matching patterns quickly and data lookup can be done in logarithmic time with respect to the network size.

Date: Friday, 19 January 2007
Time: 10:00 a.m. - 12:00 noon
Venue: Room 3501, Lifts 25-26

Chairman: Prof. Inchi Hu (ISMT)

Committee Members:
Prof. Dik Lun Lee (Supervisor)
Prof. Qiong Luo (Supervisor)
Prof. Frederick Lochovsky
Prof. Wilfred Ng
Prof. Danny Tsang (ECE)
Prof. Wang-Chien Lee (Comp. Sci. & Engg., The Pennsylvania State Univ.)

**** ALL are Welcome ****