New Clustering Approaches for Mining Salient Patterns in High Dimensional Data

Speaker:	Dr. Wei WANG
		Department of Computer Science
		University of North Carolina at Chapel Hill
		USA

Title:		"New Clustering Approaches for Mining Salient Patterns
		 in High Dimensional Data"

Date:		Thursday, 12 November 2009

Time:		11:00am - 12 noon

Venue:		Room 2404 (via lifts 17/18), HKUST


Abstract:

Advances in technology have made data collection easier and
faster, resulting in large and complex datasets consisting of hundreds of
thousands of objects with hundreds of dimensions. Scalable and efficient
unsupervised clustering methods have been among the most popular approaches
for analyzing these large datasets. Traditional clustering approaches
typically partition objects into disjoint groups based on distances in
full dimensional space. However, more often than not, some dimensions of
high dimensional data may be irrelevant to a cluster and can mask the
cluster's existence. This phenomenon, an aspect of the curse of dimensionality,
prevents salient structures from being discovered by traditional
clustering approaches. We have developed unsupervised clustering approaches to
capture pattern-preserving clusters in the subspaces of high dimensional
space. The proposed subspace clustering algorithms tackle the curse of
dimensionality by localizing the search for clusters to the subspaces of
the original high dimensional data. They go beyond the existing
distance-based clustering criteria by revealing consistent patterns that
can be far apart in distance.


*******************
Biography:

Wei Wang is an associate professor in the Department of Computer Science
and a member of the Carolina Center for Genome Sciences at the University
of North Carolina at Chapel Hill. Dr. Wang's research interests include
data mining, bioinformatics, and databases. She has filed seven patents,
and has published one monograph and more than one hundred research papers
in international journals and major peer-reviewed conference proceedings.
Dr. Wang received the IBM Invention Achievement Awards in 2000 and 2001.
She was the recipient of a UNC Junior Faculty Development Award in 2003
and an NSF Faculty Early Career Development (CAREER) Award in 2005. She
was named a Microsoft Research New Faculty Fellow in 2005. She was honored
with the 2007 Phillip and Ruth Hettleman Prize for Artistic and Scholarly
Achievement at UNC. Dr. Wang is an associate editor of the IEEE
Transactions on Knowledge and Data Engineering and ACM Transactions on
Knowledge Discovery from Data, and an editorial board member of the
International Journal of Data Mining and Bioinformatics. She serves as a
program committee co-chair of IEEE ICDM 2009 and has served on the program
committees of prestigious international conferences such as ACM SIGMOD,
ACM SIGKDD, VLDB, ICDE, EDBT, ACM CIKM, IEEE ICDM, and SSDBM.