More about HKUST
Computing summaries over Distributed Data
The Hong Kong University of Science and Technology Department of Computer Science and Engineering PhD Thesis Defence Title: "Computing summaries over Distributed Data" By Mr. Zengfeng HUANG Abstract Consider a distributed system with k nodes, where each node holds a part of the data. The goal is to extract some useful information from the entire data set or to compute some functions over the data. We are interested in designing communication-efficient algorithms and also characterizing the communication complexity for various problems. We consider both a flat network structure and more complicated tree networks. In this thesis, we study some most important statistical summaries of the underlying data, in particular item frequencies, heavy hitters, quantiles, and eps-approximations, which are extensively studied in database, machine learning, computational geometry and data mining. We provide general techniques for both designing efficient algorithms and proving communication lower bounds, with which we get almost tight bounds for these problems. We also study graph problems in the distributed setting, where the edges of the input graph is partitioned across k nodes. We show how to compute an approximate maximum matching, one of most important graph problems, communication-efficiently. We also prove a tight lower bound for this problem. To prove this bound, we develop new techniques, which we believe will have a wide applicability to prove distributed communication complexity for other graph problems. Date: Friday, 16 August 2013 Time: 10:00am – 12:00noon Venue: Room 3501 Lifts 25/26 Chairman: Prof. Kam Sing Wong (PHYS) Committee Members: Prof. Ke Yi (Supervisor) Prof. Sunil Arya Prof. Siu-Wing Cheng Prof. Ajay Joneja (IELM) Prof. Xiaoming Sun (CAS, Beijing) **** ALL are Welcome ****