Research on Small Data for Locality of References and on Big Data for Wisdom of Crowds

Speaker:        Professor Xiaodong Zhang
                The Ohio State University
                USA

Title:          "Research on Small Data for Locality of References and on
                 Big Data for Wisdom of Crowds"

Date:           Thursday, 9 January 2014

Time:           2:00pm - 3:30pm

Venue:          Chow Tak Sin Lecture Theater
                (LT-G, near lifts 25/26), HKUST

Abstract:

Computer science R&D has become data-centric. Data access patterns in
various applications are commonly characterized by popularity rank in a
two-dimensional graph: the data items are sorted along the horizontal axis
by their access frequencies, which are plotted on the vertical axis. The
power law is a well-known curve for this purpose, where a small portion of
data (small data) receives highly frequent accesses, representing the
locality of references, and a large portion of data (big data) receives
infrequent accesses, represented by a long tail in the graph. In the past
40 years, computer science researchers in both systems and applications
have made effective efforts to exploit the locality of small data by
building deep memory hierarchies in computer systems. With the rapid
advancement of computing and storage technologies, big data in the long
tail has increased dramatically and is now permanently archived, which has
started to attract strong interest in many areas for both research and
production.

In this talk, I will present several case studies on both small data and
big data to show the distinct natures and purposes of these two types of
studies. For small data, the access patterns are largely predictable for
software and hardware design, and the main purpose is to improve
performance and system utilization. For big data, the access patterns are
largely unpredictable; thus, the data processing environment must be
massive, scalable, and low cost. The purpose of big data processing is to
discover new knowledge deeply hidden in the ocean of data.


**********************
Biography:

Xiaodong Zhang is the Robert M. Critchfield Professor in Engineering and
Chair of the Computer Science and Engineering Department at the Ohio State
University. His research interests focus on data management in computer
and distributed systems. He has made strong efforts to transfer his
academic research into advanced technologies to update the design and
implementation of major general-purpose computing systems. He received his
Ph.D. in Computer Science from the University of Colorado at Boulder, where
he received the Distinguished Engineering Alumni Award in 2011. He is a
Fellow of the ACM and a Fellow of the IEEE.