Distant Domain Transfer Learning
PhD Thesis Proposal Defence

Title: "Distant Domain Transfer Learning"

by

Mr. Ben TAN

Abstract:

Transfer learning adapts and reuses knowledge from source domains for a target domain. It has become popular in data mining, machine learning, and many other areas. A major assumption in many transfer learning algorithms is that the source and target domains are closely related. This relation can take the form of related instances, features or models, and can be measured by the KL-divergence or the A-distance. However, if two domains are not directly related, knowledge transfer between them will not be effective. This source-target domain gap is a serious impediment to the successful application of transfer learning.

In this proposal, we study a novel learning problem: Distant Domain Transfer Learning (DDTL). In DDTL, we aim to bridge large domain gaps and transfer knowledge even if the source and target domains share few factors directly. For example, the source domain may contain plenty of labeled text documents while the target domain is composed of image data, so the two have completely different feature spaces; or the source domain may classify face images while the target domain distinguishes plane images, so the two share no common characteristics in shape or other aspects and are conceptually distant. The DDTL problem is important because solving it can greatly expand the application scope of transfer learning and help reuse as much previous knowledge as possible. Nonetheless, it is a difficult problem because the distribution gap between the source and target domains is large.

Inspired by human transitive inference and learning ability, whereby two seemingly unrelated concepts can be connected by a string of intermediate bridges using auxiliary concepts, we study in this proposal a novel learning framework: transitive transfer learning (TTL).
The main idea of TTL is to transfer knowledge between distant domains by using auxiliary intermediate data as a bridge. The distant domains can have heterogeneous feature spaces, or homogeneous feature spaces but distant characteristics, and they can be connected by one or more intermediate domains. We propose several algorithms to tackle the DDTL problem under different problem settings, and will verify the proposed algorithms on real-world data sets.

Date: Monday, 5 December 2016

Time: 3:40pm - 5:40pm

Venue: Room 3501 (lifts 25/26)

Committee Members:
Prof. Qiang Yang (Supervisor)
Dr. Yangqiu Song (Chairperson)
Prof. Long Quan
Prof. Nevin Zhang

**** ALL are Welcome ****