TRANSFER LEARNING WITH OPEN WEB DATA
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering
PhD Thesis Defence
Title: "TRANSFER LEARNING WITH OPEN WEB DATA"
By
Mr. Wei XIANG
Abstract
In recent years, transfer learning has been applied to a variety of real-world
application domains, including text classification, image classification,
link prediction, activity recognition, and social network analysis. Transfer
learning is particularly useful when we have only limited labeled data in a
target domain and must therefore consult one or more auxiliary or source
domains to gain insight into how to solve the target problem. Thus, the key to
successful knowledge transfer is that the problem designer supplies one or
more “right” source data sets at learning time. However, it is very difficult
to identify a proper set of source data. A natural question is whether we can
seek the needed source data directly from the open Web. In this
thesis, we study how to extend existing transfer learning techniques so that
they can transfer knowledge from massive and noisy Web data. We focus on
tackling the following four research issues: (1) Transfer
over information gap; (2) Transfer from heterogeneous data; (3) Transfer with
partially labeled correspondence; (4) Selective transfer from massive and noisy
sources. For each of these issues, we first study the difficulty of the
problem in depth and then propose a series of effective solutions. Moreover,
because using the Web as a source means handling massive data, we also
investigate how to make our transfer learning models scalable with the help
of distributed computing techniques. We apply these methods to two diverse
applications, text classification and link prediction, and achieve promising
results. Experimental results show that our methods can benefit from the
truly useful information contained in the Web while minimizing the risks
posed by the massive and noisy nature of the open Web.
Date: Tuesday, 29 May 2012
Time: 2:00pm – 4:00pm
Venue: Room 3501
Lifts 25/26
Chairman: Prof. Kun Xu (MATH)
Committee Members: Prof. Qiang Yang (Supervisor)
Prof. Shing-Chi Cheung
Prof. Raymond Wong
Prof. Rong Zheng (ISOM)
Prof. Haifeng Wang (Harbin Inst. of Tech.)
**** ALL are Welcome ****