CROSS-MATCHING BIG ASTRONOMIC CATALOGS ON HETEROGENEOUS CLUSTERS
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

PhD Thesis Defence

Title: "CROSS-MATCHING BIG ASTRONOMIC CATALOGS ON HETEROGENEOUS CLUSTERS"

By

Miss Xiaoying JIA

Abstract

In astronomy, cross-match is a central operation to integrate multi-wavelength information by identifying celestial objects across multiple catalogs. With the rapid increase in data volume from space and ground-based surveys, it becomes mandatory to process large astronomic catalogs efficiently. In this thesis, we study how to accelerate the cross-match of billion-record catalogs on a cluster of heterogeneous computers with both CPUs and GPUs.

Specifically, we present two cross-match algorithms, namely IB-CM (Index-Based Cross-Match) and MASJ-CM (Multi-Assignment Single-Join Cross-Match), and study the performance impact of indexing methods as well as design choices and optimizations of both algorithms for a heterogeneous computer cluster. We have implemented these algorithms to fully utilize the computation and communication resources of the cluster, and compared them with implementations on Spark and SpatialHadoop, two popular distributed computing platforms.

Our evaluations on real-world astronomic catalogs show that our native implementations were orders of magnitude faster than those on Spark or SpatialHadoop, and that self-matching billion-record catalogs on a six-node cluster finished in under five minutes.

Date: Wednesday, 26 July 2017
Time: 10:00am - 12:00 noon
Venue: Room 2612B, Lifts 31/32

Chairman: Prof. Zhenyang Lin (CHEM)

Committee Members:
Prof. Qiong Luo (Supervisor)
Prof. Lei Chen
Prof. Raymond Wong
Prof. Wei Zhang (ECE)
Prof. Xiaowen Chu (Comp. Sci., Baptist U)

**** ALL are Welcome ****