TRANSFER LEARNING WITH OPEN WEB DATA
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

PhD Thesis Defence

Title: "TRANSFER LEARNING WITH OPEN WEB DATA"

By

Mr. Wei XIANG

Abstract

In recent years, transfer learning has been applied to a variety of real-world application domains, ranging from text classification, image classification, link prediction, and activity recognition to social network analysis. Transfer learning is particularly useful when only limited labeled data is available in a target domain, so that we must consult one or more auxiliary (source) domains to gain insight into how to solve the target problem. The key to successful knowledge transfer is therefore that the problem designer supplies one or more “right” sets of source data at learning time. However, identifying a proper set of source data is very difficult. A natural question is whether the needed source data can be sought directly from the open Web.

In this thesis, we study how to extend existing transfer learning techniques to cope with transfer learning from massive and noisy Web data. We focus on the following four research issues: (1) transfer over an information gap; (2) transfer from heterogeneous data; (3) transfer with partially labeled correspondence; (4) selective transfer from massive and noisy sources. For each of these issues, we first study the difficulty of the problem extensively and then propose a series of effective solutions. Moreover, because the source is massive Web data, we also investigate how to make our transfer learning models scalable with the help of distributed computing techniques. We apply these methods to two diverse applications, text classification and link prediction, and achieve promising results.
Experimental results show that our methods can successfully benefit from the truly useful information contained in the Web, while minimizing the risks posed by the massive and noisy nature of the open Web.

Date: Tuesday, 29 May 2012

Time: 2:00pm – 4:00pm

Venue: Room 3501
Lifts 25/26

Chairman: Prof. Kun Xu (MATH)

Committee Members:
Prof. Qiang Yang (Supervisor)
Prof. Shing-Chi Cheung
Prof. Raymond Wong
Prof. Rong Zheng (ISOM)
Prof. Haifeng Wang (Harbin Inst. of Tech.)

**** ALL are Welcome ****