CO-LOCATION PATTERN DISCOVERY
PhD Thesis Proposal Defence

Title: "CO-LOCATION PATTERN DISCOVERY"

by

Miss Xiangye Xiao

Abstract:

Co-location pattern discovery aims to find classes of spatial objects that are frequently located together. For example, if two categories of businesses often locate together, they might be identified as a co-location pattern; if several biological species frequently live in nearby places, they might be a co-location pattern. Many real-world spatial data sets, such as GPS logs, yellow pages, and map search logs, contain co-location patterns. Co-location patterns have many useful applications. For example, co-located query patterns, which are subsets of queries that are often searched for nearby target locations, can be used in location-sensitive query suggestion, point-of-interest recommendation, and local advertising.

In mining co-location patterns from real data, we find that existing approaches have two problems. First, they only find global co-location patterns. The regional co-location patterns they miss are also interesting and potentially useful. Second, they do not scale to large data sets because of the huge number of candidates and the expensive instance generation.

To address these problems, we propose three co-location pattern discovery approaches.

First, we propose a lattice-based co-location pattern discovery approach (LatticeCLPMiner). This approach can find both regional and global co-location patterns, distinguish regional patterns from global ones, and find the applicable areas of regional patterns.

Second, we propose a density-based approach (DenseCLPMiner) to speed up co-location pattern mining. DenseCLPMiner exploits the non-uniform distribution of spatial objects.
It processes the dense areas first to generate instances of candidates, maintains upper bounds on the prevalence of candidates using the instances generated in the partitions already processed, and prunes candidates once their prevalence upper bounds fall below a threshold. As a result, the overall cost of instance generation is reduced.

Third, we propose a bitmap-based candidate pruning technique (BitmapPruner) that speeds up co-location pattern mining. We introduce an approximate prevalence measure and define approximate co-location patterns. We propose a bitmap structure to quickly discover approximate patterns. If a candidate is not an approximate pattern, we prune it immediately without entering the costly instance generation step. With this lightweight and effective pruning technique, we improve the efficiency of existing co-location pattern discovery approaches.

Date: Friday, 17 April 2009

Time: 10:30 a.m. - 12:30 p.m.

Venue: Room 4480, lifts 25-26

Committee Members:
Dr. Qiong Luo (Supervisor)
Dr. Wei-Ying Ma (Supervisor, Microsoft Research)
Prof. Dik-Lun Lee (Chairperson)
Prof. Frederick Lochovsky
Dr. Wilfred Ng
Dr. Xing Xie (Microsoft Research)

**** ALL are Welcome ****