Two Popular Queries in Massive Multidimensional Datasets
PhD Thesis Proposal Defence

Title: "Two Popular Queries in Massive Multidimensional Datasets"

by

Miss Wuman Luo

ABSTRACT:

The arrival of the cyber-physical systems era is changing data analysis in many ways. Driven by advances in Internet and sensor technologies, the amount of multidimensional content, such as images, trajectories, and video clips, has grown to an unprecedented level. Supporting multidimensional objects at large scale requires significant extensions to traditional databases. One critical issue is indexing and query processing. In this proposal, we discuss two types of queries in massive multidimensional datasets: high-dimensional similarity join and most frequent path finding.

In the first part of this proposal, we study how to perform parallel high-dimensional joins in the MapReduce paradigm. Specifically, we propose a cost model to demonstrate that it is important to take both communication and computation costs into account as dimensionality and data volume increase. To this end, we propose an efficient compression approach that can significantly reduce both of these costs. Moreover, we design two parallel frameworks that can scale up to massive data sizes and very high dimensionality.

In the second part of this proposal, we address the problem of path finding by evaluating the desirability of a path from a novel perspective, i.e., how frequently the path has been taken within the given time constraints. This new query not only helps users learn from the experiences of past travelers, but also takes the variability of road and traffic conditions into account. To achieve this goal, we first design two indexes for efficient trajectory searching and splitting. After that, we develop a "footmark graph" construction algorithm to calculate road segment frequencies from raw trajectories. Finally, we propose a "most frequent path finding" algorithm that applies the "more frequent" relation in a dynamic programming manner.
Date: Friday, 15 June 2012

Time: 1:30pm - 3:30pm

Venue: Room 3501 (lifts 25/26)

Committee Members:
Prof. Lionel Ni (Supervisor)
Dr. Qiong Luo (Chairperson)
Dr. Lei Chen
Dr. Lin Gu

**** ALL are Welcome ****