Query Evaluation under Differential Privacy

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence


Title: "Query Evaluation under Differential Privacy"

By

Mr. Wei DONG


Abstract

Differential privacy (DP) has garnered significant attention from both 
academia and industry due to its potential in offering robust privacy 
protection for individual data during analysis. With the increasing volume 
of sensitive information being collected by organizations and analyzed 
through SQL queries, the development of a general-purpose query engine 
that is capable of supporting a broad range of SQLs while maintaining 
differential privacy has become the holy grail in privacy-preserving query 
release. Over the past several years, the database and security 
communities have made numerous efforts to achieve this objective. However, 
two major challenges still exist:

Challenge 1: Privacy guarantee in relational databases. Informally 
speaking, DP requires indistinguishability of the query results whether 
any particular individual’s data is included or not in the database. While 
this can be easily defined and well studied over a single flat table, the 
situation becomes more complex in a relational database with the presence 
of multiple relations, foreign keys, and the join operator.

Challenge 2: Optimal privacy-utility trade-off. Noise injection is 
inherently necessary for privacy protection, but the central scientific 
question is how to achieve the lowest error (i.e., highest utility) under 
a given privacy budget. Unfortunately, for the problem of relational query 
evaluation under DP, traditional notions of optimality, i.e., instance 
optimality and worst-case optimality, are either unattainable or 
meaningless. Thus, a new notion of optimality is needed for answering this 
question.

In this thesis, we aim at tackling these challenges. To model the complex 
situation of relational databases, we study two different DP policies: 
tuple-DP, which protects the privacy of single tuples in each relation, 
and user-DP, which protects all data belonging to each user via foreign 
keys. Under each policy, we have designed DP mechanisms for answering a 
broad class of queries consisting of the selection, projection, 
aggregation, and join operators. To formalize their optimality, we 
introduce the notion of neighborhood optimality, which sits between 
instance optimality and worst-case optimality. Finally, based on the 
algorithms and theory developed, we have built a DP-SQL system that 
significantly outperforms existing systems in terms of both utility and 
efficiency.


Date:			Tuesday, 23 May 2023

Time:			10:00am - 12:00noon

Venue:			Room 5566
 			lifts 27/28

Chairperson:		Prof. Hongyu YU (MAE)

Committee Members:	Prof. Ke YI (Supervisor)
 			Prof. Dimitrios PAPADIAS
 			Prof. Dimitris PAPADOPOULOS
 			Prof. Jiheng ZHANG (IEDA)
 			Prof. Xiaokui XIAO (NUS)


**** ALL are Welcome ****