Practical Differential Privacy

PhD Thesis Proposal Defence


Title: "Practical Differential Privacy"

by

Mr. Georgios KELLARIS


Abstract:

In this proposal we focus on publishing statistics on a private database with 
epsilon-differential privacy. We focus on two scenarios; (i) when the 
statistics are computed over a static database, and (ii) when the statistics 
are published continuously over data that are updated by a stream.

For the first scenario, we address one-time publishing of non-overlapping 
counts. These statistics are useful in a wide and important range of 
applications, including transactional, traffic and medical data analysis. Prior 
work on epsilon-differential privacy publishes such statistics with 
prohibitively low utility in several practical scenarios. Towards this end, we 
present GS, a method that pre-processes the counts by elaborately grouping and 
smoothing them via averaging. This step acts as a form of preliminary 
perturbation that diminishes sensitivity, and enables GS to achieve 
epsilon-differential privacy through low Laplace noise injection. The grouping 
strategy is dictated by a sampling mechanism, which minimizes the smoothing 
perturbation. We demonstrate the superiority of GS over its competitors, and 
confirm its practicality, via extensive experiments on real datasets.

For the second scenario, we address continuously publishing of non-overlapping 
counts. Numerous applications require continuous publication of statistics for 
monitoring purposes, such as real-time traffic analysis, timely disease 
outbreak discovery, and social trends observation. These statistics may be 
derived from sensitive user data and, hence, necessitate privacy preservation. 
Although epsilon-differential privacy is a notable paradigm for offering strong 
privacy guarantees in statistics publishing, there is limited literature that 
adapts this concept to settings where the statistics are computed over an 
infinite stream of "events" (i.e., data items generated by the users), and 
published periodically. These works aim at hiding a single event over the 
entire stream. We argue that, in most practical scenarios, sensitive 
information is revealed from multiple events occurring at contiguous time 
instances. Towards this end, we put forth the novel notion of w-event privacy 
over infinite streams, which protects an event sequence occurring in w 
successive time instants. We first formulate our privacy concept, motivate its 
importance, and introduce a methodology for achieving it. We next design two 
instantiations, whose utility is independent of the stream length. Finally, we 
confirm the practicality of our solutions experimenting with real data.


Date:			Thursday, 29 May 2014

Time:                   1:00pm - 3:00pm

Venue:                  Room 3501
                         lifts 25/26

Committee Members:	Prof. Dimitris Papadias (Supervisor)
 			Dr. Raymond Wong (Chairperson)
 			Prof. Dik-Lun Lee
 			Dr. Qiong Luo


**** ALL are Welcome ****