EFFICIENT CORRELATED PATTERN DISCOVERY IN DATABASES
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

PhD Thesis Defence

Title: "EFFICIENT CORRELATED PATTERN DISCOVERY IN DATABASES"

By

Miss Yiping Ke

Abstract

Correlation mining has achieved great success in many application domains because of its ability to capture the underlying dependency between objects. However, existing research on correlation mining has mainly been conducted on boolean databases, even though other, more complex data have proliferated in recent years, especially in scientific and business domains. In this thesis, we study correlated pattern discovery in two widely used types of databases: quantitative databases and graph databases.

For mining correlations from quantitative databases, we propose a novel notion of Quantitative Correlated Patterns (QCPs), which is founded on two correlation measures: normalized mutual information and all-confidence. We also develop an algorithm that mines QCPs efficiently by using a supervised interval combining method and performing bi-level pruning.

For mining graph databases, we formalize a new problem of Correlated Graph Search (CGS), using Pearson's correlation coefficient as the correlation measure. We devise an efficient algorithm that solves the CGS problem by mining the candidates from a much smaller projected database. We also use theoretical bounds on the support of a candidate graph to answer high-support queries directly, without mining the candidates.

Experimental results on both real and synthetic datasets confirm the efficiency and effectiveness of the proposed solutions.

Date: Friday, 18 January 2008
Time: 10:00 a.m. - 12:00 noon
Venue: Room 4480, Lifts 25-26
Chairman: Prof. Kun Xu (MATH)
Committee Members:
Prof. Wilfred Ng (Supervisor)
Prof. Dik Lun Lee
Prof. Ke Yi
Prof. Oscar Au (ECE)
Prof. Xindong Wu (Comp. Sci., Univ. of Vermont)

**** ALL are Welcome ****