Simba: Towards Building Interactive Big Spatial Data Systems

Speaker:        Prof. Feifei Li
                School of Computing
                University of Utah

Title:          "Simba: Towards Building Interactive Big Spatial
                 Data Systems"

Date:           Monday, 5 December 2016

Time:           11:00am to 12 noon

Venue:          Lecture Theater F (near lifts 25/26), HKUST

Abstract:

Interactive querying over large spatial data has become a critical requirement
in many applications. As a result, it is essential to provide
fast, scalable, and high-throughput query processing and analytics
for numerous applications. We will present the Simba system that
offers scalable and efficient in-memory spatial query processing
and analytics for big spatial and multimedia data. Simba is based on
Spark and runs over a cluster of commodity machines. In
particular, Simba extends the Spark SQL engine to support rich spatial
queries and analytics through both SQL and the DataFrame API. It
introduces the concept and construction of indexes over RDDs in order to
work with big spatial data and complex spatial operations.
Lastly, Simba implements an effective query optimizer, which leverages its
indexes and novel spatial-aware optimizations, to achieve both low latency
and high throughput. Extensive experiments over large data sets
demonstrate Simba's superior performance compared with other big data
analytics systems. Through its SQL and DataFrame API, Simba provides
interactive analytics over big data, but when data grows too big
and/or computation becomes too expensive, we will achieve
interactive analytics through online analytics.

We will also review related studies on building interactive and online
query engines and our recent efforts in extending Simba with
machine-learning-driven analytics.


******************
Biography:

Feifei Li is currently an associate professor at the School of Computing,
University of Utah. He obtained his Bachelor's degree from Nanyang
Technological University (transferred from Tsinghua University) in 2001
and his PhD from Boston University in 2007. His research focuses on improving
the scalability, efficiency, and effectiveness of database and big data
systems. He also works on data security problems in these systems. He was
a recipient of an NSF CAREER Award in 2011, two HP IRP awards in 2011 and
2012 respectively, a Google App Engine award in 2013, an IEEE ICDE best
paper award in 2004, the IEEE ICDE 10+ Years Most Influential Paper Award
in 2014, a Google Faculty award in 2015, the SIGMOD Best Demonstration Award
at SIGMOD 2015, and the SIGMOD 2016 Best Paper Award. He is/was the demo
PC co-chair for SIGMOD 2018, the demo PC co-chair for VLDB 2014, the
general co-chair for SIGMOD 2014, a PC area chair for ICDE 2014 and SIGMOD
2015, and currently serves as an associate editor for IEEE TKDE. He is a
member of the SIGMOD Jim Gray PhD Dissertation Award Committee, starting
2017.