Clustering Effectiveness and Efficiency of Community-based Web Search
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering
Title: "Clustering Effectiveness and Efficiency of Community-based Web
Search"
by
Mr. XU Teng
Abstract
Many collaborative web search methods are based on user profiling and
clustering on user communities. Among the numerous graph models, the
Community Clickthrough Model (CCM) captures users' conceptual preferences
and thus outperforms other content-ignorant models. The corresponding
clustering algorithm for CCM, the Community-based Agglomerative Divisive
Clustering (CADC) algorithm, allows incremental clustering of the
clickthrough data. In this thesis, we identify limitations of CCM and
CADC and develop enhancements to alleviate them. By adding the
page miniature, URL type, to form a quad-partite graph, we propose the
Enhanced Community Clickthrough Model (E-CCM) that can capture the users'
interests more precisely than CCM. We also develop a refinement of
CADC, called CADCInc, which supports incremental updates and hence maintains
high efficiency even when the volume of data is very large. Experiments
show that our proposed E-CCM achieves a significant effectiveness gain
over the original model, and that our CADCInc algorithm can
maintain a bounded processing time for large amounts of clickthrough data
with only a very small tradeoff in clustering quality.
Date : 13 May 2011 (Friday)
Time : 1:30pm to 2:10pm
Venue : 3402 (17-18 lift)
Advisor : Prof. Dik Lee
2nd reader : Dr. Wilfred Ng