More about HKUST
Computing summaries over Distributed Data
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering
PhD Thesis Defence
Title: "Computing summaries over Distributed Data"
By
Mr. Zengfeng HUANG
Abstract
Consider a distributed system with k nodes, where each node holds a part
of the data. The goal is to extract some useful information from the
entire data set or to compute some functions over the data. We are
interested in designing communication-efficient algorithms and also
characterizing the communication complexity for various problems. We
consider both a flat network structure and more complicated tree networks.
In this thesis, we study some most important statistical summaries of the
underlying data, in particular item frequencies, heavy hitters, quantiles,
and eps-approximations, which are extensively studied in database, machine
learning, computational geometry and data mining. We provide general
techniques for both designing efficient algorithms and proving
communication lower bounds, with which we get almost tight bounds for
these problems.
We also study graph problems in the distributed setting, where the edges
of the input graph is partitioned across k nodes. We show how to compute
an approximate maximum matching, one of most important graph problems,
communication-efficiently. We also prove a tight lower bound for this
problem. To prove this bound, we develop new techniques, which we believe
will have a wide applicability to prove distributed communication
complexity for other graph problems.
Date: Friday, 16 August 2013
Time: 10:00am – 12:00noon
Venue: Room 3501
Lifts 25/26
Chairman: Prof. Kam Sing Wong (PHYS)
Committee Members: Prof. Ke Yi (Supervisor)
Prof. Sunil Arya
Prof. Siu-Wing Cheng
Prof. Ajay Joneja (IELM)
Prof. Xiaoming Sun (CAS, Beijing)
**** ALL are Welcome ****