Transfer Learning via Dimensionality Reduction
PhD Thesis Proposal Defence

Title: "Transfer Learning via Dimensionality Reduction"

by

Mr. Jialin Pan

Abstract:

A major assumption in many machine learning and data mining algorithms is that the training and future data must be in the same feature space and follow the same distribution. However, in many real-world applications this assumption may not hold. For example, we may have a classification task in one domain of interest but sufficient training data only in another domain, where the data may lie in a different feature space or follow a different distribution. In such cases, knowledge transfer, if done successfully, can greatly improve learning performance by avoiding expensive data-labeling effort. In recent years, transfer learning has emerged as a new learning setting to address this problem.

In this proposal, we propose a novel dimensionality reduction framework for transfer learning. The framework learns a subspace across domains in a Reproducing Kernel Hilbert Space (RKHS) using the Maximum Mean Discrepancy (MMD) criterion, so that in the subspace the distance between the source- and target-domain data distributions becomes small. As a result, with the new representations in this subspace, standard machine learning methods can be applied to train classifiers or regression models in the source domain for use in the target domain. Based on the framework, we first present two unsupervised dimensionality reduction methods for transfer learning: Maximum Mean Discrepancy Embedding (MMDE) and Transfer Component Analysis (TCA). The effectiveness of MMDE and TCA has been verified by experiments on two real-world applications: cross-domain indoor WiFi localization and cross-domain text classification.
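To make the distribution-distance criterion above concrete: MMD compares two samples through kernel mean embeddings in an RKHS, and its squared value has a simple empirical estimate built from kernel matrices. The sketch below (a minimal illustration, not the proposal's MMDE/TCA algorithms, which additionally *learn* a transform that minimizes this quantity) computes the biased empirical squared MMD with an RBF kernel; the function names, `gamma` value, and toy Gaussian samples are all illustrative assumptions.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Pairwise RBF kernel: k(a, b) = exp(-gamma * ||a - b||^2)
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * sq)

def mmd2(X, Y, gamma=1.0):
    # Biased empirical estimate of squared MMD between samples X and Y:
    # mean k(x, x') + mean k(y, y') - 2 * mean k(x, y)
    Kxx = rbf_kernel(X, X, gamma)
    Kyy = rbf_kernel(Y, Y, gamma)
    Kxy = rbf_kernel(X, Y, gamma)
    return Kxx.mean() + Kyy.mean() - 2 * Kxy.mean()

rng = np.random.default_rng(0)
src = rng.normal(0.0, 1.0, (100, 2))  # toy source-domain sample
tgt = rng.normal(3.0, 1.0, (100, 2))  # toy target-domain sample, shifted mean
print(mmd2(src, src))  # identical samples -> squared MMD is 0
print(mmd2(src, tgt))  # shifted distributions -> clearly positive
```

A subspace in which this estimate is small for the source and target data is exactly the kind of representation the proposed framework seeks.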
We further propose a semi-supervised extension of TCA (SSTCA), which not only reduces the distance between domains but also maximizes the dependence between features and labels when learning the subspace in an RKHS, by integrating MMD with the Hilbert-Schmidt Independence Criterion (HSIC). Preliminary experiments show that the proposed SSTCA method is promising for transfer learning. In the future, we plan to test the effectiveness of SSTCA in more applications and to study the three proposed dimensionality reduction methods theoretically.

Date: Thursday, 22 April 2010
Time: 10:00am - 12:00noon
Venue: Room 3408 (lifts 17/18)

Committee Members:
Prof. Qiang Yang (Supervisor)
Dr. Brian Mak (Chairperson)
Dr. Raymond Wong
Prof. Dit-Yan Yeung

**** ALL are Welcome ****
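As a side note on the HSIC dependence measure named in the abstract: HSIC, like MMD, reduces to a computation on centered kernel matrices, which is what lets SSTCA combine the two criteria in one objective. The sketch below (a minimal illustration of the biased HSIC estimate only, not SSTCA itself; the variable names and toy data are illustrative assumptions) shows that HSIC is large when labels depend on features and near zero when they are independent.

```python
import numpy as np

def hsic(K, L):
    # Biased empirical HSIC estimate: tr(K H L H) / (n - 1)^2,
    # where H = I - (1/n) 11^T centers both kernel matrices.
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

rng = np.random.default_rng(0)
x = rng.normal(size=(200, 1))                  # toy features
y_dep = x + 0.1 * rng.normal(size=(200, 1))    # labels that depend on x
y_ind = rng.normal(size=(200, 1))              # labels independent of x
Kx = x @ x.T                                   # linear kernel on features
print(hsic(Kx, y_dep @ y_dep.T))  # large: strong feature-label dependence
print(hsic(Kx, y_ind @ y_ind.T))  # near zero: no dependence
```

Maximizing such a term while minimizing MMD is the intuition behind the semi-supervised objective described above.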