Transfer Learning via Dimensionality Reduction

PhD Thesis Proposal Defence


Title: "Transfer Learning via Dimensionality Reduction"

by

Mr. Jialin Pan


Abstract:

A major assumption in many machine learning and data mining algorithms is
that the training and future data must be in the same feature space and
have the same distribution. However, in many real-world applications, this
assumption may not hold. For example, we sometimes have a classification
task in one domain of interest, but we only have sufficient training data
in another domain, where the latter data may be in a different feature
space or follow a different data distribution. In such cases, knowledge
transfer, if done successfully, can greatly improve learning performance
by avoiding much of the expensive data-labeling effort. In recent
years, transfer learning has emerged as a new learning setting to address
this problem. In this proposal, we propose a novel dimensionality
reduction framework for transfer learning. The framework learns a
subspace shared across domains in a Reproducing Kernel Hilbert Space
(RKHS) using the Maximum Mean Discrepancy (MMD) criterion. In this
subspace, the distance between the source and target data distributions
is reduced, so that with the new representations we can apply standard
machine learning methods to train classifiers or regression models in
the source domain for use in the target domain. Based on the
framework, we first present two unsupervised dimensionality reduction
methods, Maximum Mean Discrepancy Embedding (MMDE) and Transfer Component
Analysis (TCA), for transfer learning. The effectiveness of MMDE and TCA
has been verified by experiments on two real-world applications:
cross-domain indoor WiFi localization and cross-domain text
classification. We further propose a semi-supervised extension of TCA
(SSTCA), which not only reduces the distance between domains but also
maximizes the dependence between features and labels when learning the
subspace in an RKHS, by integrating the MMD and Hilbert-Schmidt
Independence Criterion (HSIC) techniques. Preliminary experiments show
that the proposed SSTCA method is promising for transfer learning. In
the future, we plan to further test the effectiveness of SSTCA in more
applications and to study the three proposed dimensionality reduction
methods for transfer learning theoretically.
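
For readers unfamiliar with the MMD criterion used throughout the abstract, the following is a minimal NumPy sketch (not taken from the thesis) of the biased empirical estimate of the squared MMD between two samples, using an RBF kernel; the bandwidth gamma and the Gaussian toy data are illustrative assumptions:

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """Pairwise RBF kernel matrix k(a, b) = exp(-gamma * ||a - b||^2)."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    return np.exp(-gamma * sq_dists)

def mmd2(X, Y, gamma=1.0):
    """Biased empirical estimate of squared MMD between samples X and Y:
    mean k(X, X) + mean k(Y, Y) - 2 * mean k(X, Y)."""
    return (
        rbf_kernel(X, X, gamma).mean()
        + rbf_kernel(Y, Y, gamma).mean()
        - 2.0 * rbf_kernel(X, Y, gamma).mean()
    )

# Toy "source" and "target" samples: same distribution vs. a shifted one.
rng = np.random.default_rng(0)
source = rng.normal(0.0, 1.0, (200, 2))
target_same = rng.normal(0.0, 1.0, (200, 2))
target_shifted = rng.normal(3.0, 1.0, (200, 2))

# Samples from the same distribution yield a small MMD; samples from a
# shifted distribution yield a much larger one.
print(mmd2(source, target_same))
print(mmd2(source, target_shifted))
```

Dimensionality reduction methods such as MMDE and TCA minimize this quantity while learning the low-dimensional representation, so that source and target data become comparable in the learned subspace.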


Date:  			Thursday, 22 April 2010

Time:           	10:00am - 12:00noon

Venue:          	Room 3408
 			lifts 17/18

Committee Members:      Prof. Qiang Yang (Supervisor)
 			Dr. Brian Mak (Chairperson)
 			Dr. Raymond Wong
 			Prof. Dit-Yan Yeung


**** ALL are Welcome ****