On Big Spatial Data Processing: Emerging Metric & Novel Problem

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence


Title: "On Big Spatial Data Processing: Emerging Metric & Novel Problem"

By

Mr. Junqiu WEI


Abstract

Spatial data processing is an important field in database and data mining which 
has a lot of real-world applications and serves as a crucial building block for 
other algorithms. In the past few decades, a lot of research effort was spent 
on this field. In a high-level view, spatial data processing spans a large body 
of problems and has two dimensions, namely metrics and problems. In terms of 
the metrics, the Euclidean space is the most fundamental one and was 
intensively and thoroughly studied in the past. With the advance of the 
geo-spatial positioning and the computer technologies, there are many new 
metrics emerging such as the road networks and the terrain surface, where the 
techniques in the tranditional Euclidean space may not efficiently apply. As a 
result, there are many open questions and research problems to be studied in 
these emerging metrics. In terms of the problems, the shortest distance 
computing is considered as the fundamental problem in each metric. Efficiently 
tackling the fundamental problem not only helps many real-world applications, 
but also paves the way for many more advanced problems.

In this thesis, we push the frontier of the field of spatial data processing in 
both directions aforementioned (i.e., metrics and problems). Specifically, we 
studied a fundamental problem (i.e., shortest distance queries) on an emerging 
metric, namely terrain surface and proposed a new problem namely Nearby-fit 
Spatial Keyword Queries (NSKQ, in short) in the tranditional Euclidean space. 
For the shortest distance queries, we proposed an indexing structure, namely a 
distance oracle, for efficiently processing the query. By our experimental 
results, we accelerated the query processing by up to 2-5 orders of magnitudes 
compared with the state-of-the-art. This result renders many 
applications/algorithms real-time and shed light on the further research in 
this new metric. The NSKQ problem which lies in the Euclidean space is an 
important problem which captures a key feature for the POI needed in some 
real-world applications. Specifically, it ranks each POI not only by its own 
attributes (i.e, location and the keywords contained) but also its nearby POIs. 
This problem is challenging and was proved to be NP-hard. We proposed two 
approximate algorithms with best tradeoff between the running time and the 
accuracy compared with baselines considered. The experiments verified the 
efficiency and the effectiveness of our techniques.


Date:			Friday, 10 August 2018

Time:			3:00pm - 5:00pm

Venue:			Room 3494
 			Lifts 25/26

Chairman:		Prof. Iam Keong Sou (PHYS)

Committee Members:	Prof. Raymond Wong (Supervisor)
 			Prof. Dik-Lun Lee
 			Prof. Frederick Lochovsky
 			Prof. Jack Cheng (CIVL)
 			Prof. Arbee Chen (Asia Univ)


**** ALL are Welcome ****