Distant Domain Transfer Learning
PhD Thesis Proposal Defence
Title: "Distant Domain Transfer Learning"
by
Mr. Ben TAN
Abstract:
Transfer learning adapts and reuses knowledge from source domains for a target
domain. It has attained much popularity in data mining and machine learning, as
well as many other areas. A major assumption in many transfer learning
algorithms is that the source and target domains should be closely related.
This relatedness can take the form of shared instances, features or models, and
can be measured by the KL-divergence or the A-distance. However, if two domains are not
directly related, performing knowledge transfer between these domains will not
be effective. This source-target domain gap is a serious impediment to the
successful application of transfer learning.
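
As a rough illustration of how a divergence can quantify domain relatedness, the sketch below computes the KL-divergence between discrete feature distributions. It is not taken from the proposal: the function name kl_divergence, the eps smoothing constant and the toy distributions are all assumptions made for this example.

```python
import math


def kl_divergence(p, q, eps=1e-12):
    """Return KL(p || q) for two discrete distributions of equal length.

    eps is an assumed smoothing constant that avoids division by zero
    when q assigns zero probability to an outcome.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:  # terms with p_i == 0 contribute nothing
            total += pi * math.log(pi / max(qi, eps))
    return total


# Toy (made-up) feature distributions for one source and two target domains.
source = [0.5, 0.3, 0.2]
near_target = [0.45, 0.35, 0.2]
far_target = [0.05, 0.05, 0.9]

near_gap = kl_divergence(source, near_target)
far_gap = kl_divergence(source, far_target)
print(f"near: {near_gap:.4f}, far: {far_gap:.4f}")
```

Under these assumptions, the closely related target yields a much smaller divergence than the distant one, which is the kind of gap the abstract describes.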
In this proposal, we study a novel learning problem: Distant Domain Transfer
Learning (abbreviated to DDTL). In DDTL, we aim to bridge large domain gaps
and transfer knowledge even if the source and target domains share few factors
directly. For example, the source domain may contain plenty of labeled text
documents while the target domain is composed of image data, so the two have
completely different feature spaces; or the source domain may classify face
images while the target domain distinguishes plane images, so the two share no
common characteristic in shape or other aspects and are conceptually distant. The
DDTL problem is important because solving it can greatly expand the
application scope of transfer learning and help reuse as much previous
knowledge as possible. Nonetheless, this is a difficult problem as the
distribution gap between the source domain and the target domain is large.
Inspired by human transitive inference and learning ability, whereby two
seemingly unrelated concepts can be connected by a string of intermediate
bridges using auxiliary concepts, in this proposal we study a novel learning
framework: transitive transfer learning (abbreviated to TTL). The main idea of
TTL is to transfer knowledge between distant domains by using some auxiliary
intermediate data as a bridge. The distant domains can have heterogeneous
feature spaces or homogeneous feature spaces but distant characteristics, and
they can be connected by one or multiple intermediate domains. We propose
several algorithms to tackle the DDTL problem with different problem settings,
and will verify the proposed algorithms on real-world data sets.
Date: Monday, 5 December 2016
Time: 3:40pm - 5:40pm
Venue: Room 3501
(lifts 25/26)
Committee Members: Prof. Qiang Yang (Supervisor)
Dr. Yangqiu Song (Chairperson)
Prof. Long Quan
Prof. Nevin Zhang
**** ALL are Welcome ****