More about HKUST
Constructing Synopses for Query Answering
The Hong Kong University of Science and Technology Department of Computer Science and Engineering PhD Thesis Defence Title: "Constructing Synopses for Query Answering" By Mr. Yuan QIU Abstract Synopses, also known as summaries or sketches, are data structures computed from a dataset to support analytical queries. Traditionally, the purpose of using synopses is efficiency. By allocating some extra space, queries can be answered approximately but efficiently by a precomputed synopsis without referencing the original dataset. Recently, privacy has been a new concern besides efficiency. As the current standard for privacy analysis, differential privacy (DP) enjoys the property that post-processing a DP mechanism brings no extra privacy loss. This is in line with the principle of synopses: by constructing a differentially private synopsis from the dataset, many queries can be answered accurately while satisfying a predefined privacy constraint at construction time. In this thesis, we focus on three problems related to synopses construction We start with the cardinality estimation problem for Select-Project-Join queries under the non-private setting. We propose a synopsis constructed through weighted distinct sampling according to near-optimal sampling rates, and show an efficient way of constructing the it. We demonstrate its optimality through both theoretical and empirical evaluation. We then consider numerical queries on static datasets under differential privacy, which capture Select-Aggregate queries in database systems. A private synopsis is designed to achieve both query-specific and instance-specific error, which outperforms existing solutions both in theory and in practice. Finally, We extend the setting to streaming data. A private synopsis is designed for linear queries, which are generalizations of Select queries in databases. We show our synopsis for fully-dynamic streams has asymptotically the same error as answering queries in static settings, ignoring polylogarithmic factors in the length of the stream. The results can also be extended to arbitrary union-preserving queries. Date: Wednesday, 10 August 2022 Time: 10:00am - 12:00noon Zoom Meeting: https://hkust.zoom.us/j/5688510375 Chairperson: Prof. Juanyi XU (ECON) Committee Members: Prof. Ke YI (Supervisor) Prof. Siu Wing CHENG Prof. Mordecai GOLIN Prof. Yi YANG (ISOM) Prof. Hubert CHAN (HKU) **** ALL are Welcome ****