FEDERATED TRANSFER LEARNING FOR HETEROGENEOUS DATA
PhD Thesis Proposal Defence

Title: "FEDERATED TRANSFER LEARNING FOR HETEROGENEOUS DATA"

by

Mr. Xueyang WU

Abstract:

Recent advances in artificial intelligence (AI) applications rely on massive amounts of training data. In practice, these valuable data are distributed independently among multiple data owners (e.g., companies and individuals), each of whom typically holds only a modest amount, and the data are usually heterogeneous. Collecting data from individual users or acquiring data from data owners is the conventional and straightforward solution to this issue. However, such solutions are no longer viable given growing concerns over data privacy and data security. AI systems therefore face the problem of using fragmented and diverse data that are independently distributed across several data owners. Federated learning (FL), a novel privacy-preserving collaborative machine learning paradigm, has been proposed to address the problem of learning from small, privately isolated datasets. Its main idea is to form a federation of data owners in which all participants virtually pool their data without sacrificing data security or privacy. Federated learning faces several challenges, including communication efficiency, data security and privacy protection, and statistical learning. Among these, the statistical learning challenge caused by heterogeneous data significantly degrades the performance of FL systems and thus hinders the application of FL in practice. In recent years, researchers have developed a machine learning paradigm known as transfer learning, which uses heterogeneous data to solve the statistical learning problem in a target domain with limited or no data. This naturally motivates us to incorporate the spirit of transfer learning into federated learning to overcome the statistical learning difficulty in practical FL.
In this proposal, we focus on federated transfer learning, a class of federated learning methods that employ the transfer learning methodology to tackle the statistical learning difficulty posed by heterogeneous data. Unlike other federated learning approaches, which presume that the datasets held by data owners are independent and identically distributed (i.i.d.), federated transfer learning addresses data heterogeneity across data owners in practice and achieves superior performance. The proposal consists of two parts. First, we provide a brief overview of federated learning, including its concept, evolution, and categorization, and cover its statistical learning challenges in depth. We offer a precise categorization of the algorithms that address these challenges in federated learning, which we refer to as federated transfer learning. We then examine current representative works and place them within our proposed federated transfer learning architecture. Second, we identify three typical scenarios of data heterogeneity in federated learning with practical applications and investigate how our proposed federated transfer learning methods overcome the challenge in these scenarios.

Date: Tuesday, 30 August 2022

Time: 12:00 noon - 2:00pm

Zoom Meeting: https://hkust.zoom.us/j/96746404649?pwd=bTVKT0lBNnZHUUJMQXowclk1MTFnUT09

Committee Members:
Prof. Qiang Yang (Supervisor)
Prof. Lei Chen (Supervisor)
Prof. Kai Chen (Chairperson)
Dr. Qian Xu (AI Thrust)

**** ALL are Welcome ****