Secure and Practical Federated Matrix Factorization
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering
PhD Thesis Defence
Title: "Secure and Practical Federated Matrix Factorization"
By
Mr. Di CHAI
Abstract:
Matrix factorization (MF) is an essential primitive that supports various
applications, including recommender systems, genetic studies, topic modeling,
and financial applications. Most MF-based applications deal with large-scale
sensitive data, e.g., users' shopping history in recommender systems and
genome data in genetic studies, and therefore encounter the
isolated-data-islands problem: because privacy regulations restrict data
sharing, it is challenging to share data across different parties.
Consequently, the data exist as isolated islands, and conventional MF, which
requires the data to be collected centrally, cannot work. To solve this
problem, federated MF has been proposed to decompose data distributed among
different parties. While pioneering studies have shown the feasibility of
federated MF, they are limited in terms of privacy, utility, or efficiency.
In this thesis, we first define the problem of federated MF and its two
subproblems: federated sparse MF and federated dense MF. We then present the
targets achieved by existing works and propose three new research questions. In
the first question, we focus on the MF-based recommender system that deals with
sparse matrices. Existing works exchange gradients in plaintext, raising
significant privacy concerns. Our first contribution is mathematically proving
that gradients leak raw data, leading us to protect the gradients using
homomorphic encryption. However, even with the gradient values secured, the
indices of the gradients can still leak information (e.g., by enabling
inference attacks) due to the sparsity of the rating matrices. To address this, we
propose an obfuscation-based method to defend against inference attacks with
bounded utility loss and efficiency overhead. In the second question, we focus
on federated dense MF which supports applications such as genetic studies,
topic modeling, and financial applications. These applications handle
large-scale dense matrices and require high accuracy. Existing works either
suffer from utility penalties because they rely on differential privacy or
face severe efficiency issues because they rely on homomorphic encryption. We
propose a practical lossless federated singular vector decomposition (SVD)
system that is capable of decomposing billion-scale data. Specifically, we
propose lossless masking-based protection tailored for federated SVD and
improve the efficiency through extensive system optimizations. The evaluation
results demonstrate that our proposed solutions outperform existing studies and
significantly improve the practicality of MF-based real-world applications. In
the third question, we concentrate on decentralized federated MF. Most of the
existing solutions rely on third-party servers, which compromises the system's
security. Although some works have tried to remove the external servers using
homomorphic encryption, they suffer from severe efficiency issues. We remove
the external servers using a novel lightweight matrix protection scheme and
achieve high efficiency through extensive computational and communication
optimizations.
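
As a rough illustration of how masking can protect a matrix without costing accuracy, the sketch below multiplies a matrix by random orthogonal matrices before factorizing it and then removes the masks from the result. It is not taken from the thesis: the function names, the use of NumPy, and the choice of random orthogonal masks are all assumptions made for illustration. Because orthogonal transforms preserve singular values, the factorization of the masked matrix can be unmasked exactly, up to floating-point error.

```python
import numpy as np


def random_orthogonal(n, rng):
    # Hypothetical helper: the Q factor of a Gaussian matrix is orthogonal.
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q


def masked_svd(x, rng):
    """Factorize x via a masked copy (illustrative sketch only)."""
    m, n = x.shape
    p = random_orthogonal(m, rng)   # left mask, kept secret by the data owner
    q = random_orthogonal(n, rng)   # right mask, kept secret by the data owner

    # The party that runs the factorization only ever sees the masked matrix.
    x_masked = p @ x @ q
    u_m, s, vt_m = np.linalg.svd(x_masked, full_matrices=False)

    # Remove the masks: x = (P^T U') diag(s) (V'^T Q^T).
    u = p.T @ u_m
    vt = vt_m @ q.T
    return u, s, vt


rng = np.random.default_rng(0)
x = rng.standard_normal((6, 4))
u, s, vt = masked_svd(x, rng)
reconstruction_error = np.max(np.abs(u @ np.diag(s) @ vt - x))
```

In this toy setting a single process holds both the data and the masks; how masks are generated, shared, and removed across parties in a real federated system is outside what this sketch shows.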
Date: Friday, 17 January 2025
Time: 12:30pm - 2:30pm
Venue: Room 3494 (Lifts 25/26)
Chairman: Dr. Shiheng WANG (ACCT)
Committee Members: Prof. Qiang YANG (Supervisor)
Prof. Kai CHEN (Supervisor)
Dr. Qifeng CHEN
Prof. Ke YI
Dr. Can YANG (MATH)
Prof. Chuan WU (HKU)