Kernel Based Clustering and Low Rank Approximation

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence


Title: "Kernel Based Clustering and Low Rank Approximation"

By

Mr. Kai Zhang


Abstract

Clustering is an unsupervised data exploration task that is of
fundamental importance to pattern recognition and machine learning. This
thesis addresses two clustering paradigms, mixture models and
graph-based clustering methods, with a primary focus on improving
the scaling behavior of the related algorithms for large-scale applications.
With regard to mixture models, we are interested in reducing the model
complexity in terms of number of components. We propose a unified
algorithm to simultaneously solve "model simplification" and "component
clustering", and apply it with success in a number of learning algorithms
using mixture models, such as density-based clustering and SVM
testing. For graph-based clustering, we propose the density-weighted
Nystrom method for solving large-scale eigenvalue problems, which
demonstrates encouraging performance in normalized cut and kernel
principal component analysis. We further extend this to the low-rank
approximation of kernel matrices, a key component in scaling up
kernel machines. We provide an error analysis of the Nystrom low-rank
approximation, based on which we propose a new sampling scheme.
Our scheme is very efficient and numerically outperforms a number of
state-of-the-art approaches such as incomplete Cholesky decomposition, the
standard Nystrom method, and probabilistic sampling approaches.


Date:			Wednesday, 30 July 2008

Time:			2:00 p.m. - 4:00 p.m.

Venue:			Room 3401
			Lifts 17-18

Chairman:		Prof. Roger Cheng (ECE)

Committee Members:	Prof. James Kwok (Supervisor)
			Prof. Long Quan
			Prof. Dit-Yan Yeung
			Prof. Chris Ding (Comp. Sci. & Engg., Univ. of Texas)
			Prof. Irwin King (Comp. Sci. & Engg., CUHK)


**** ALL are Welcome ****