Speaker: Ms. Cui YU, National University of Singapore

Title: High-dimensional Indexing

Date: Monday, 26 February 2001

Time: 4:00pm - 5:00pm

Venue: Lecture Theater F (Leung Yat Sing Lecture Theater), Academic Concourse (near lift nos. 25/26), HKUST

Abstract:
High-dimensional indexing has been considered as an important means to facilitate fast query processing in data mining, fast retrieval and similarity search in image and scientific databases. They are required to support high-dimensional window/range queries, similarity range and K-nearest neighbor queries. In this talk, I shall present two transformation-based indexing schemes, iMinMax(theta) and iDistance. The iMinMax(theta) method maps points in high-dimensional to single-dimensional values determined by their maximum or minimum values among all dimensions. With such representation, we are able to index high-dimensional data points using a classical B+-tree. By varying the tuning `knob', theta, we can obtain different family of iMinMax structures that are optimized for different distributions of data sets. The metric-based iDistance method is designed to efficiently support similarity or K-nearest neighbor (KNN) queries. In the iDistance, the high-dimensional space is split into partitions, and each partition is associated with an `anchor' point (called reference point) whereby other points in the same partitions can be made reference to. With such transformation, the transformed points can be indexed using a B+-tree, and similarity search in the high-dimensional space is performed as a sequence of increasingly larger range queries on the single dimensional space. We conducted extensive experiments using different data sets, and in this talk, I shall present the results. The major strengths of both indexes are their simplicity, efficiency, and the fact that they can be integrated into existing systems effectively. In fact, both methods are being used in a commercial image system.

Biography:
Ms Cui YU obtained her B. Eng from Nanjing University of Aeronautics and Astronautics in 1995. She did her PhD at National University of Singapore, and has recently submitted her thesis. Her research interests include indexing, clustering techniques and query processing. She has two patents pending, and has participated in a startup called GeoFoto.