Research on Small Data for Locality of References and on Big Data for Wisdom of Crowds
Speaker: Professor Xiaodong Zhang, The Ohio State University, USA
Title: "Research on Small Data for Locality of References and on Big Data for Wisdom of Crowds"
Date: Thursday, 9 January 2014
Time: 2:00pm - 3:30pm
Venue: Chow Tak Sin Lecture Theater (LT-G, near lifts 25/26), HKUST

Abstract:

Computer science R&D has become data-centric. Data access patterns in various applications are commonly characterized by popularity rank in a two-dimensional graph: the data items are sorted along the horizontal axis by their access frequencies, which are plotted on the vertical axis. The power law is a well-known curve for this purpose: a small portion of the data (small data) receives highly frequent accesses, representing the locality of references, while a large portion of the data (big data) receives infrequent accesses, represented by a long tail in the graph. Over the past 40 years, computer science researchers in both systems and applications have made effective efforts to exploit the locality of small data by building deep memory hierarchies in computer systems. With the rapid advancement of computing and storage technologies, the big data in the long tail has grown dramatically and is now permanently archived, which has started to attract strong interest in many areas for both research and production.

In this talk, I will present several case studies on both small data and big data to show the distinct natures and purposes of these two types of studies. For small data, the access patterns are largely predictable for software and hardware design, and the main purpose is to improve performance and system utilization. For big data, the access patterns are largely unpredictable; thus, the data processing environment must be massive, scalable, and low cost. The purpose of big data processing is to discover new knowledge deeply hidden in the ocean of data.

**********************

Biography:

Xiaodong Zhang is the Robert M. Critchfield Professor in Engineering and Chair of the Computer Science and Engineering Department at the Ohio State University. His research interests focus on data management in computer and distributed systems. He has made strong efforts to transfer his academic research into advanced technologies that update the design and implementation of major general-purpose computing systems. He received his Ph.D. in Computer Science from the University of Colorado at Boulder, where he received the Distinguished Engineering Alumni Award in 2011. He is a Fellow of the ACM and a Fellow of the IEEE.