Efficient Aggregation under Differential Privacy

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence


Title: "Efficient Aggregation under Differential Privacy"

By

Miss Juanru FANG


Abstract:

Differential privacy (DP) has become the mainstream privacy standard due to 
its strong protection of individual users' information. It guarantees that 
for any neighboring datasets, the output of the mechanism is practically 
identical. There are two common definitions of neighborhoods in differential 
privacy. The first is record-level DP, where neighboring datasets differ by 
exactly one record; the second is user-level DP, where two datasets are 
neighbors if they differ by a single user's data. Some commonly used 
aggregation functions include SUM, MAX, AVERAGE, etc.

In our first work, we note that many frequently encountered aggregation 
functions are monotonic or can be derived from monotonic functions. Based on 
this insight, we design a general user-level DP mechanism that estimates 
monotonic functions with strong optimality guarantees. While our general 
mechanism may run in super-polynomial time, we show how to instantiate an 
approximate version that runs in polynomial time, making it efficient for 
certain common monotonic functions.

From our first work, we observe that existing user-level DP mechanisms still 
have high computational costs due to the complex relationships between users 
and records. To address this challenge, random sampling has proven to be an 
effective approach for reducing computational costs. However, all existing 
sampling-based mechanisms are designed for record-level DP. Therefore, in 
our second work, we take the first step towards investigating the 
application of random sampling under user-level DP and propose an accurate 
and efficient sampling-based mechanism for the sum estimation function.

A limitation of the first two works is that they focus on the central DP 
model, where all users trust the data curator. However, this assumption is 
often impractical. Thus, in our third work, we shift to the distributed 
setting and focus on the shuffle DP model. We explore the subgraph counting 
problem under edge DP, which is a specific case of aggregation function 
estimation in the context of record-level DP. We propose a general 
sampling-based mechanism, apply it to several common subgraphs, and analyze 
the performance guarantees.


Date:                   Thursday, 29 May 2025

Time:                   3:00pm - 5:00pm

Venue:                  Room 2128A
                        Lift 19

Chairman:               Dr. Zhihong GUO (CHEM)

Committee Members:      Prof. Ke YI (Supervisor)
                        Prof. Qiong LUO
                        Dr. Dimitris PAPADOPOULOS
                        Prof. Yuan YAO (MATH)
                        Dr. Hubert CHAN (HKU)