Distant Domain Transfer Learning

PhD Thesis Proposal Defence


Title: "Distant Domain Transfer Learning"

by

Mr. Ben TAN


Abstract:

Transfer learning adapts and reuses knowledge from source domains for a target 
domain. It has attained much popularity in data mining and machine learning, as 
well as many other areas. A major assumption in many transfer learning 
algorithms is that the source and target domains should be closely related. 
This relation can take the form of related instances, features or models, and 
can be measured by the KL-divergence or A-distance. However, if two domains are not 
directly related, performing knowledge transfer between these domains will not 
be effective. This source-target domain gap is a serious impediment to the 
successful application of transfer learning.

In this proposal, we study a novel learning problem: Distant Domain Transfer 
Learning (abbreviated to DDTL). In DDTL, we aim to bridge large domain gaps 
and transfer knowledge even if the source and target domains share few factors 
directly. For example, the source domain may contain plenty of labeled text 
documents while the target domain is composed of image data, so that the two 
have completely different feature spaces; or the source domain may classify 
face images while the target domain distinguishes plane images, so that the 
two share no common characteristic in shape or other aspects and are 
conceptually distant. The DDTL problem is important, as solving it can greatly 
expand the application scope of transfer learning and help reuse as much 
previous knowledge as possible. Nonetheless, it is a difficult problem, as the 
distribution gap between the source domain and the target domain is large.

Inspired by human transitive inference and learning ability, whereby two 
seemingly unrelated concepts can be connected by a string of intermediate 
bridges using auxiliary concepts, in this proposal we study a novel learning 
framework: transitive transfer learning (abbreviated to TTL). The main idea of 
TTL is to transfer knowledge between distant domains by using some auxiliary 
intermediate data as a bridge. The distant domains can have heterogeneous 
feature spaces or homogeneous feature spaces but distant characteristics, and 
they can be connected by one or multiple intermediate domains. We propose 
several algorithms to tackle the DDTL problem under different problem settings, 
and will verify the proposed algorithms on real-world data sets.


Date:                   Monday, 5 December 2016

Time:                   3:40pm - 5:40pm

Venue:                  Room 3501
                        (lifts 25/26)

Committee Members:      Prof. Qiang Yang (Supervisor)
                        Dr. Yangqiu Song (Chairperson)
                        Prof. Long Quan
                        Prof. Nevin Zhang


**** ALL are Welcome ****