Computing Statistical summaries on Massive Distributed Data
PhD Thesis Proposal Defence

Title: "Computing Statistical summaries on Massive Distributed Data"

by

Mr. Zengfeng Huang

ABSTRACT:

Consider a distributed system with k nodes, where each node holds a part of the data. Our goal is to design communication-efficient algorithms for computing functions over the entire data set. In this thesis, we focus on computing some of the most important statistical summaries of the underlying data, in particular item frequencies, heavy hitters, quantiles, and eps-approximations. We consider both a flat network structure and more complicated tree networks. We give efficient algorithms with communication costs that scale sublinearly in the size of the communication network. We also give almost tight lower bounds, both deterministic and randomized, for all the problems we study in this thesis.

Date: Wednesday, 30 January 2013
Time: 4:00pm - 6:00pm
Venue: Room 3494 (lifts 25/26)

Committee Members:
Dr. Ke Yi (Supervisor)
Prof. Siu-Wing Cheng (Chairperson)
Dr. Sunil Arya
Prof. Mordecai Golin

**** ALL are Welcome ****