More about HKUST
Range Counting Under Differential Privacy
The Hong Kong University of Science and Technology Department of Computer Science and Engineering PhD Thesis Defence Title: "Range Counting Under Differential Privacy" By Mr. Ziyue HUANG Abstract The past few years have seen major advances in machine learning and data analysis due to the access to substantial amounts of user data. However, the collected personal information is often quite sensitive. Differential privacy is a rigorous mathematical definition for privacy-preserving data publishing, which has been widely adopted nowadays. This thesis mainly focuses on the fundamental problem of range counting under differential privacy, i.e., computing how many points are lying inside any specific region from a certain range space, and its applications. The query family we consider consists of geometric ranges of a certain type, e.g., points, rectangles, halfspaces, spheres, or arbitrarily-shaped regions. First, we study the range counting problem in the centralized model, where a trusted curator holds the entire dataset. Existing lower bounds based on discrepancy theory suggest that large errors have to be introduced in order to preserve privacy: Essentially for any range space (except axis-parallel rectangles), the error has to be polynomial. We show that by allowing a standard notion of geometric approximation where points near the boundary of the range may or may not be counted, the error can be reduced to logarithmic. Next, we study the mean estimation problem under differential privacy, where worst-case optimal mechanisms do not offer meaningful utility guarantees in practice when the global sensitivity is very large. We take a principled approach, yielding a mechanism that is instance-optimal in a strong sense. The mechanism is based on private quantile estimation that relies on range counting mechanisms. Finally, we study the frequency estimation problem under both privacy and communication constraints in the multiparty model, where the (one-shot or streaming) data is distributed among multiple parties. Our protocols achieve optimality (up to polylogarithmic factors) permissible by the more stringent of the two constraints. Date: Tuesday, 18 January 2022 Time: 10:00am - 12:00noon Venue: Room 3494 Lifts 25/26 Chairperson: Prof. Weichuan YU (ECE) Committee Members: Prof. Ke YI (Supervisor) Prof. Sunil ARYA Prof. Siu Wing CHENG Prof. Yuan YAO (MATH) Prof. Hubert CHAN (HKU) **** ALL are Welcome ****