FEDERATED TRANSFER LEARNING UNDER HETEROGENEOUS DATA
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

PhD Thesis Defence

Title: "FEDERATED TRANSFER LEARNING UNDER HETEROGENEOUS DATA"

By

Mr. Xueyang WU

Abstract

Recent advancements in artificial intelligence (AI) applications rely on massive amounts of training data. In practice, these valuable data are independently distributed among multiple data owners (e.g., companies and individuals), whose quantities are typically modest, and the data are usually heterogeneous. Collecting data from individual users or acquiring data from data owners is a conventionally popular and straightforward solution to this issue. However, such solutions have become obsolete due to the rising trend of data privacy and data security concerns. Currently, AI systems face the problem of utilizing fragmented and diverse data that are independently distributed across several data owners.

Federated learning (FL), a novel privacy-preserving collaborative machine learning paradigm, is proposed to address the privately isolated small data learning problem. Its main idea is to compose a federation of data owners in which all participants virtually assemble their data without sacrificing data security and privacy. There are several challenges for federated learning, including communication efficiency, data security and privacy protection, and statistical learning. Among these challenges, the statistical learning challenge caused by heterogeneous data significantly affects the performance of FL systems and thus prohibits FL's applications in practice. In recent years, academics have developed a machine learning paradigm known as transfer learning, which utilizes heterogeneous data to solve the statistical learning issue in the target domain with limited or no data. Naturally, it motivates us to incorporate the spirit of transfer learning into federated learning to overcome the difficulty of statistical learning in practical FL.

In this thesis, we focus on federated transfer learning, a class of federated learning methods that employ the transfer learning methodology to tackle the statistical learning difficulty posed by heterogeneous data. Compared to other federated learning approaches, which presume datasets on data owners are similarly and independently distributed, federated transfer learning focuses on how to address data heterogeneity across data owners in practice and achieves superior performance.

The thesis consists of two parts. First, we provide a brief overview of federated learning, including its concept, evolution, and categorization. More specifically, we cover its statistical learning challenges in depth. We offer a precise categorization of algorithms addressing these challenges in federated learning, which we refer to as federated transfer learning. Then, we examine current representative works and incorporate them into our proposed federated transfer learning architecture. Second, we identify three typical scenarios of data heterogeneity in federated learning with practical applications and investigate how our proposed federated transfer learning methods overcome the challenge in these scenarios. We believe that these federated transfer learning methods hold great promise for wider applications of federated learning.

Date: Monday, 12 December 2022
Time: 10:30am - 12:30pm
Venue: Room 3494, lifts 25/26

Chairperson: Prof. Lixin XU (MATH)

Committee Members:
Prof. Qiang YANG (Supervisor)
Prof. Lei CHEN (Supervisor)
Prof. Kai CHEN
Prof. Yangqiu SONG
Prof. Can YANG (MATH)
Prof. Qing LI (PolyU)

**** ALL are Welcome ****