Random Sampling on Big Data: Techniques and Applications

Speaker:        Professor Ke YI
                Department of Computer Science and Engineering
                Hong Kong University of Science and Technology

Title:          "Random Sampling on Big Data: Techniques and
                 Applications"

Date:           Monday, 14 Oct 2019

Time:           4:00pm - 5:00pm

Venue:          Lecture Theater F (near lift no. 25/26), HKUST

Abstract:

Random sampling is a powerful tool for big data analytics. It can be used
whenever complete accuracy is not required, and it offers
order-of-magnitude improvements in query efficiency. Random sampling has
been extensively studied in both the statistics and computer science
literature. This talk will take a "sample" of this huge literature. In
particular, we will discuss random sampling over streaming and distributed
data, importance sampling, merge-reduce sampling, and sampling for
approximate query processing.


*****************
Biography:

Ke Yi is a Professor in the Department of Computer Science and
Engineering, Hong Kong University of Science and Technology. He obtained
his Bachelor's degree from Tsinghua University (2001) and PhD from Duke
University (2006), both in computer science. His research spans
theoretical computer science and database systems. He has received a
Google Faculty Research Award (2010), the Young Investigator Research
Award from HKUST (2012), a SIGMOD Best Demonstration Award (2015), and the
SIGMOD Best Paper Award (2016). He currently serves as an Associate Editor
of ACM Transactions on Database Systems and IEEE Transactions on Knowledge
and Data Engineering.