Simba: Towards Building Interactive Big Spatial Data Systems
Speaker: Prof. Feifei Li
School of Computing
University of Utah

Title: "Simba: Towards Building Interactive Big Spatial Data Systems"

Date: Monday, 5 December 2016
Time: 11:00am to 12 noon
Venue: Lecture Theater F (near lifts 25/26), HKUST

Abstract:

Interactive queries over large spatial data have become a critical requirement in many applications. As a result, it is essential to provide fast, scalable, and high-throughput query processing and analytics for these applications. We will present the Simba system, which offers scalable and efficient in-memory spatial query processing and analytics for big spatial and multimedia data. Simba is based on Spark and runs over a cluster of commodity machines. In particular, Simba extends the Spark SQL engine to support rich spatial queries and analytics through both SQL and the DataFrame API. It introduces the concept and construction of indexes over RDDs in order to work with big spatial data and complex spatial operations. Lastly, Simba implements an effective query optimizer, which leverages its indexes and novel spatial-aware optimizations to achieve both low latency and high throughput. Extensive experiments over large data sets demonstrate Simba's superior performance compared against other big data analytics systems. Through its SQL and DataFrame API, Simba provides interactive analytics over big data, but when the data grows too big and/or the computation becomes too expensive, we will achieve interactive analytics through online analytics. We will also review related studies on building interactive and online query engines, and our recent efforts in extending Simba with machine-learning-driven analytics.

******************

Biography:

Feifei Li is currently an associate professor at the School of Computing, University of Utah. He obtained his Bachelor's degree from Nanyang Technological University (transferred from Tsinghua University) in 2001 and his PhD from Boston University in 2007.
His research focuses on improving the scalability, efficiency, and effectiveness of database and big data systems. He also works on data security problems in these systems. He received an NSF CAREER Award in 2011, two HP IRP awards in 2011 and 2012, a Google App Engine award in 2013, an IEEE ICDE best paper award in 2004, the IEEE ICDE 10+ Years Most Influential Paper Award in 2014, a Google Faculty award in 2015, the SIGMOD Best Demonstration Award in 2015, and the SIGMOD 2016 Best Paper Award. He is/was the demo PC co-chair for SIGMOD 2018, the demo PC co-chair for VLDB 2014, the general co-chair for SIGMOD 2014, and a PC area chair for ICDE 2014 and SIGMOD 2015, and he currently serves as an associate editor for IEEE TKDE. He is a member of the SIGMOD Jim Gray PhD Dissertation Award Committee, starting in 2017.