Speaker: Anthony Tung, Simon Fraser University

Title: Clustering in the Presence of Obstacles

Date: Thursday, 29 March 2001

Time: 2:00pm - 3:00pm

Venue: Room 2404, HKUST

Abstract:
Cluster analysis, which groups data to find overall distributions, patterns and interesting correlations among data sets, has numerous applications in pattern recognition, spatial data analysis, image processing, market research, etc. While much emphasis has been placed on improving the efficiency and effectiveness of clustering, little work has been done on incorporating users' domain constraints into clustering.

In this talk, we will look at the problem of clustering under physical constraints. Our problem instance consists of a two-dimensional region with a set of points to be clustered and a set of polygons which represent physical obstacles like lakes, rivers and highways. Closeness between two points is measured by their obstructed distance, i.e., the length of the shortest path between them such that all obstacles are avoided. As clustering is performed using the obstructed distance, we call this problem a COD (Clustering with Obstructed Distance) problem.
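To make the notion of obstructed distance concrete, here is a minimal Python sketch. It is an illustration only, not the method presented in the talk. It assumes every obstacle is a convex polygon given as a list of (x, y) vertices in counter-clockwise order, and it finds the shortest path with Dijkstra's algorithm over a visibility graph whose nodes are the two points and all obstacle vertices. The names `crosses_interior` and `obstructed_distance` are hypothetical.

```python
import heapq
import math

EPS = 1e-9

def crosses_interior(p, q, poly):
    """True if segment p-q passes through the open interior of a convex,
    counter-clockwise polygon (touching the boundary does not count)."""
    t0, t1 = 0.0, 1.0
    dx, dy = q[0] - p[0], q[1] - p[1]
    n = len(poly)
    for i in range(n):
        a, b = poly[i], poly[(i + 1) % n]
        ex, ey = b[0] - a[0], b[1] - a[1]
        # side > 0 means p lies strictly to the left of edge a->b (inside).
        side = ex * (p[1] - a[1]) - ey * (p[0] - a[0])
        # rate is how fast that signed side changes as we move from p to q.
        rate = ex * dy - ey * dx
        if abs(rate) < EPS:
            if side <= EPS:
                return False  # parallel to this edge and never strictly inside
        elif rate > 0:
            t0 = max(t0, (EPS - side) / rate)
        else:
            t1 = min(t1, (EPS - side) / rate)
    return t0 + EPS < t1

def obstructed_distance(p, q, obstacles):
    """Length of the shortest p-q path that avoids all obstacle interiors."""
    nodes = [p, q] + [v for poly in obstacles for v in poly]

    def visible(u, v):
        return not any(crosses_interior(u, v, poly) for poly in obstacles)

    best = {0: 0.0}
    heap = [(0.0, 0)]
    done = set()
    while heap:
        d, i = heapq.heappop(heap)
        if i in done:
            continue
        done.add(i)
        if i == 1:  # node 1 is the target point q
            return d
        for j in range(len(nodes)):
            if j in done or not visible(nodes[i], nodes[j]):
                continue
            step = math.hypot(nodes[i][0] - nodes[j][0], nodes[i][1] - nodes[j][1])
            if d + step < best.get(j, math.inf):
                best[j] = d + step
                heapq.heappush(heap, (d + step, j))
    return math.inf

# A square "lake" sitting between the two points forces a detour.
lake = [(1, -1), (2, -1), (2, 1), (1, 1)]
print(obstructed_distance((0, 0), (3, 0), []))      # straight line, no obstacle
print(obstructed_distance((0, 0), (3, 0), [lake]))  # path around a corner of the lake
```

The sketch checks visibility lazily and has no preprocessing, so it is only meant for small examples; non-convex obstacles would need a different intersection test.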

As a solution to this problem, we propose a scalable clustering algorithm, called COD-CLARANS. We will look at various forms of pre-processed information that could enhance the efficiency of COD-CLARANS. In the strictest sense, the COD problem can be treated as a change in distance function and thus could be handled by existing clustering algorithms by modifying the distance function. However, we show that by pushing the task of handling obstacles into COD-CLARANS instead of abstracting it at the distance function level, more optimization can be achieved in the form of a pruning function. We will look at various performance studies to show that COD-CLARANS is both efficient and effective.
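As a rough illustration of why handling obstacles inside the algorithm can save work, consider the following Python sketch. It is not the pruning function of COD-CLARANS, whose details are not given in this abstract. It relies only on the fact that the straight-line (Euclidean) distance never exceeds the obstructed distance, so a candidate cluster center can be skipped once its Euclidean distance already exceeds the best obstructed distance found so far. The function `nearest_center` and the `obstructed_distance` callback are hypothetical names.

```python
import math

def euclid(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def nearest_center(point, centers, obstructed_distance):
    """Return (index, distance, evaluations) for the center closest to `point`
    under `obstructed_distance`, skipping centers that cannot win.

    Assumes obstructed_distance(a, b) >= euclid(a, b) for all a, b.
    """
    # Visit centers in order of increasing Euclidean distance (cheap to compute).
    order = sorted(range(len(centers)), key=lambda i: euclid(point, centers[i]))
    best_i, best_d, evaluations = None, math.inf, 0
    for i in order:
        if euclid(point, centers[i]) >= best_d:
            break  # this and all remaining centers are too far even with no obstacles
        d = obstructed_distance(point, centers[i])  # the expensive call
        evaluations += 1
        if d < best_d:
            best_i, best_d = i, d
    return best_i, best_d, evaluations

# Toy stand-in for a real obstructed distance: crossing the line x = 0
# (say, a river) doubles the travel distance.
def toy_distance(a, b):
    factor = 2.0 if (a[0] < 0) != (b[0] < 0) else 1.0
    return factor * euclid(a, b)

print(nearest_center((-1, 0), [(1, 0), (-4, 0), (10, 0)], toy_distance))
```

In this toy run the third center is never evaluated, because its Euclidean distance alone already exceeds the best obstructed distance found.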

If time allows, other research of the speaker will also be introduced.

Biography:
Anthony K. H. Tung studied at the National University of Singapore, obtaining his B.Sc. (Computer Science) in 1997 and his M.Sc. in 1998 under the accelerated master's programme. During that period, his work focused on the application of data mining to online problems like ship berthing and database buffer management. Currently, Anthony is a Ph.D. candidate at Simon Fraser University, Canada. His research interest is in the field of data mining and its applications in geographical data analysis, collaborative filtering, and DNA analysis.