Range Counting Under Differential Privacy

PhD Thesis Proposal Defence


Title: "Range Counting Under Differential Privacy"

by

Mr. Ziyue HUANG


Abstract:

The past few years have seen major advances in the area of machine 
learning and data analysis due to the access to substantial amounts of 
user data. However, the collected personal information, such as disease, 
salary, address and keyboard typing, is often quite sensitive. 
Differential privacy is a rigorous mathematical definition for 
privacy-preserving data publishing, which has been widely adopted 
nowadays.

This thesis proposal mainly focuses on the fundamental problem of range 
counting under differential privacy, i.e., computing how many points in 
the dataset are lying inside any specific region from a certain query 
family (a.k.a range space). The query family we consider consists of 
geometric ranges of a certain type in constant-dimensional Euclidean 
space, e.g., points, (axis-parallel) rectangles, halfspaces, simplices, 
spheres, or arbitrarily-shaped regions. First, we study the range counting 
problem in the central model, where a trusted curator holds the entire 
dataset. Existing lower bounds based on discrepancy theory suggest that 
large errors have to be introduced in order to preserve privacy: 
Essentially for any range space (except axis-parallel rectangles), the 
error has to be polynomial. We show that by allowing a standard notion of 
geometric approximation where points near the boundary of the range may or 
may not be counted, the error can be reduced to logarithmic. Next, we 
study the frequency estimation problem under both privacy and 
communication constraints in the multiparty model, where the (one-shot or 
streaming) data is distributed among k parties. Our protocols achieve 
optimality (up to polylogarithmic factors) permissible by the more 
stringent of the two constraints. Besides applications of its own, a 
frequency estimation protocol can be used as a building block to solve 
many related problems such as heavy hitters, quantiles, and orthogonal 
range counting, at the cost of some extra polylogarithmic factors. 
Finally, we briefly describe our ongoing work. We study the mean 
estimation problem under differential privacy, where worst-case optimal 
mechanisms do not offer meaningful utility guarantees in practice when the 
global sensitivity is very large. We aim at instance optimality and give a 
principled approach without the need of parameter tuning, wherein an 
important building block is a private mechanism (based on range counting) 
for finding specific quantiles.


Date:			Wednesday, 16 June 2021

Time:                  	3:00pm - 5:00pm

Zoom Meeting: 		https://hkust.zoom.com.cn/j/8230095287

Committee Members:	Prof. Ke Yi (Supervisor)
 			Dr. Sunil Arya (Chairperson)
 			Prof. Mordecai Golin
 			Prof. Siu-Wing Cheng


**** ALL are Welcome ****