Estimating Aggregation Functions under User-level Differential Privacy

PhD Thesis Proposal Defence


Title: "Estimating Aggregation Functions under User-level Differential Privacy"

by

Miss Juanru FANG


Abstract:

Differential privacy (DP) has become the mainstream privacy standard due to its 
strong protection of individual users' information. It guarantees that for any 
neighboring datasets that differ by one user's data, the output of the 
mechanism is practically identical. Most research has focused on record-level 
DP where each user possesses exactly one record in the dataset, so that 
neighboring datasets differ by exactly one record. Recently, user-level DP has 
gained lots of attention as it does not set limitation on the number of records 
each user contributes and protects all records contributed by any particular 
user, thus offering stronger privacy protection.

In user-level DP, a key challenge is to compute aggregation functions. One of 
the fundamental aggregation functions studied by most existing work is the sum 
estimation function. Additionally, other commonly used aggregation functions 
include MAX, AVERAGE, and etc.

In our first work, we note that many frequently encountered aggregation 
functions are monotonic or can be derived from monotonic functions. Based on 
this insight, we design a general user-level DP mechanism that estimates 
monotonic functions with strong optimality guarantees. While our general 
mechanism may run in super-polynomial time, we demonstrate how to instantiate 
an approximate version that runs in polynomial time for some common monotonic 
functions, including sum, k-selection, maximum frequency, and distinct count.

From our first work, we observe that existing user-level DP mechanisms have 
high computational costs due to the complex relationships between users and 
records. To address this challenge, random sampling has proven to be an 
effective approach for reducing computational costs. However, all existing 
sampling-based mechanisms are designed for record-level DP. Therefore, in our 
second work, we take the first step towards investigating the application of 
random sampling under user-level differential privacy and propose an accurate 
and efficient sampling-based mechanism for the sum estimation function.


Date:                   Wednesday, 29 May 2024

Time:                   10:00am - 12:00noon

Venue:                  Room 4472
                        Lifts 25/26

Committee Members:      Prof. Ke Yi (Supervisor)
                        Dr. Dimitris Papadopoulos (Chairperson)
                        Dr. Amir Goharshday
                        Dr. Shuai Wang