More about HKUST
Range Counting Under Differential Privacy
PhD Thesis Proposal Defence Title: "Range Counting Under Differential Privacy" by Mr. Ziyue HUANG Abstract: The past few years have seen major advances in the area of machine learning and data analysis due to the access to substantial amounts of user data. However, the collected personal information, such as disease, salary, address and keyboard typing, is often quite sensitive. Differential privacy is a rigorous mathematical definition for privacy-preserving data publishing, which has been widely adopted nowadays. This thesis proposal mainly focuses on the fundamental problem of range counting under differential privacy, i.e., computing how many points in the dataset are lying inside any specific region from a certain query family (a.k.a range space). The query family we consider consists of geometric ranges of a certain type in constant-dimensional Euclidean space, e.g., points, (axis-parallel) rectangles, halfspaces, simplices, spheres, or arbitrarily-shaped regions. First, we study the range counting problem in the central model, where a trusted curator holds the entire dataset. Existing lower bounds based on discrepancy theory suggest that large errors have to be introduced in order to preserve privacy: Essentially for any range space (except axis-parallel rectangles), the error has to be polynomial. We show that by allowing a standard notion of geometric approximation where points near the boundary of the range may or may not be counted, the error can be reduced to logarithmic. Next, we study the frequency estimation problem under both privacy and communication constraints in the multiparty model, where the (one-shot or streaming) data is distributed among k parties. Our protocols achieve optimality (up to polylogarithmic factors) permissible by the more stringent of the two constraints. Besides applications of its own, a frequency estimation protocol can be used as a building block to solve many related problems such as heavy hitters, quantiles, and orthogonal range counting, at the cost of some extra polylogarithmic factors. Finally, we briefly describe our ongoing work. We study the mean estimation problem under differential privacy, where worst-case optimal mechanisms do not offer meaningful utility guarantees in practice when the global sensitivity is very large. We aim at instance optimality and give a principled approach without the need of parameter tuning, wherein an important building block is a private mechanism (based on range counting) for finding specific quantiles. Date: Wednesday, 16 June 2021 Time: 3:00pm - 5:00pm Zoom Meeting: https://hkust.zoom.com.cn/j/8230095287 Committee Members: Prof. Ke Yi (Supervisor) Dr. Sunil Arya (Chairperson) Prof. Mordecai Golin Prof. Siu-Wing Cheng **** ALL are Welcome ****