FEDERATED TRANSFER LEARNING FOR HETEROGENEOUS DATA

PhD Thesis Proposal Defence


Title: "FEDERATED TRANSFER LEARNING FOR HETEROGENEOUS DATA"

by

Mr. Xueyang WU


Abstract:

Recent advances in artificial intelligence (AI) applications rely on massive
amounts of training data. In practice, these valuable data are scattered
across multiple data owners (e.g., companies and individuals), each of whom
typically holds only a modest amount, and the data are usually heterogeneous.
Collecting data from individual users or acquiring it from data owners has
been the conventional and straightforward solution to this issue. However,
such solutions are becoming impractical because of rising concerns about data
privacy and data security. AI systems therefore face the problem of utilizing
fragmented and diverse data that are distributed across many data owners.

Federated learning (FL), a novel privacy-preserving collaborative machine
learning paradigm, was proposed to address the problem of learning from small,
privately isolated datasets. Its main idea is to form a federation of data
owners in which all participants virtually pool their data without sacrificing
data security and privacy. Federated learning faces several challenges,
including communication efficiency, data security and privacy protection, and
statistical learning. Among these, the statistical learning challenge caused
by heterogeneous data significantly degrades the performance of FL systems and
thus hinders the application of FL in practice. In recent years, researchers
have developed a machine learning paradigm known as transfer learning, which
utilizes heterogeneous data to solve the statistical learning problem in a
target domain with limited or no data. This naturally motivates us to
incorporate the spirit of transfer learning into federated learning to
overcome the difficulty of statistical learning in practical FL.

In this proposal, we focus on federated transfer learning, a class of federated
learning methods that employ the transfer learning methodology to tackle the
statistical learning difficulty posed by heterogeneous data. Whereas other
federated learning approaches presume that the datasets held by data owners
are independent and identically distributed (IID), federated transfer learning
focuses on addressing data heterogeneity across data owners in practice and
achieves superior performance.

The proposal consists of two parts. First, we provide a brief overview of
federated learning, including its concept, evolution, and categorization. In
particular, we cover its statistical learning challenges in depth. We offer a
precise categorization of the algorithms that address these challenges, which
we collectively refer to as federated transfer learning. We then examine
current representative works and incorporate them into our proposed federated
transfer learning architecture. Second, we identify three typical scenarios of
data heterogeneity in federated learning with practical applications and
investigate how our proposed federated transfer learning methods overcome the
challenge in these scenarios.


Date:                   Tuesday, 30 August 2022

Time:                   12:00 noon - 2:00 pm

Zoom Meeting:
https://hkust.zoom.us/j/96746404649?pwd=bTVKT0lBNnZHUUJMQXowclk1MTFnUT09

Committee Members:      Prof. Qiang Yang (Supervisor)
                        Prof. Lei Chen (Supervisor)
                        Prof. Kai Chen (Chairperson)
                        Dr. Qian Xu (AI Thrust)


**** ALL are Welcome ****