More about HKUST
On Big Spatial Data Processing: Emerging Metric & Novel Problem
The Hong Kong University of Science and Technology Department of Computer Science and Engineering PhD Thesis Defence Title: "On Big Spatial Data Processing: Emerging Metric & Novel Problem" By Mr. Junqiu WEI Abstract Spatial data processing is an important field in database and data mining which has a lot of real-world applications and serves as a crucial building block for other algorithms. In the past few decades, a lot of research effort was spent on this field. In a high-level view, spatial data processing spans a large body of problems and has two dimensions, namely metrics and problems. In terms of the metrics, the Euclidean space is the most fundamental one and was intensively and thoroughly studied in the past. With the advance of the geo-spatial positioning and the computer technologies, there are many new metrics emerging such as the road networks and the terrain surface, where the techniques in the tranditional Euclidean space may not efficiently apply. As a result, there are many open questions and research problems to be studied in these emerging metrics. In terms of the problems, the shortest distance computing is considered as the fundamental problem in each metric. Efficiently tackling the fundamental problem not only helps many real-world applications, but also paves the way for many more advanced problems. In this thesis, we push the frontier of the field of spatial data processing in both directions aforementioned (i.e., metrics and problems). Specifically, we studied a fundamental problem (i.e., shortest distance queries) on an emerging metric, namely terrain surface and proposed a new problem namely Nearby-fit Spatial Keyword Queries (NSKQ, in short) in the tranditional Euclidean space. For the shortest distance queries, we proposed an indexing structure, namely a distance oracle, for efficiently processing the query. By our experimental results, we accelerated the query processing by up to 2-5 orders of magnitudes compared with the state-of-the-art. This result renders many applications/algorithms real-time and shed light on the further research in this new metric. The NSKQ problem which lies in the Euclidean space is an important problem which captures a key feature for the POI needed in some real-world applications. Specifically, it ranks each POI not only by its own attributes (i.e, location and the keywords contained) but also its nearby POIs. This problem is challenging and was proved to be NP-hard. We proposed two approximate algorithms with best tradeoff between the running time and the accuracy compared with baselines considered. The experiments verified the efficiency and the effectiveness of our techniques. Date: Friday, 10 August 2018 Time: 3:00pm - 5:00pm Venue: Room 3494 Lifts 25/26 Chairman: Prof. Iam Keong Sou (PHYS) Committee Members: Prof. Raymond Wong (Supervisor) Prof. Dik-Lun Lee Prof. Frederick Lochovsky Prof. Jack Cheng (CIVL) Prof. Arbee Chen (Asia Univ) **** ALL are Welcome ****