Parallelizing De Novo Assembly with Heterogeneous Processors
The Hong Kong University of Science and Technology
Department of Computer Science and Engineering

PhD Thesis Defence

Title: "Parallelizing De Novo Assembly with Heterogeneous Processors"

By

Miss Shuang QIU

Abstract

De novo assemblers construct genome sequences from small fragments without using any reference genome. Specifically, they represent the fragments in a De Bruijn graph and traverse the graph to generate the sequence. As constructing and traversing a big De Bruijn graph consumes both time and memory, we develop UNIPAR, a parallel software package that runs this process on a cluster of GPU-equipped computers. In particular, it utilizes all processor cores in each CPU and GPU, all CPUs and GPUs in a computer node, and all computer nodes of the cluster.

Furthermore, we analyze the characteristics of genome data to design a concurrent hashing algorithm for the graph construction, and to reduce the communication overhead in the graph traversal. We further improve the overall performance by partitioning and storing the data in a compact format, pipelining data transfer and computation, and overlapping computation and communication. Our experiments show that on real-world datasets, UNIPAR is an order of magnitude faster than the state-of-the-art shared-memory assemblers, and more than five times faster than the current distributed assemblers.

Date: Thursday, 16 May 2019
Time: 10:00am - 12:00 noon
Venue: Room 3494, Lifts 25/26

Chairman: Prof. Ki Ling Cheung (ISOM)

Committee Members:
Prof. Qiong Luo (Supervisor)
Prof. Wilfred Ng
Prof. Ke Yi
Prof. Weichuan Yu (ECE)
Prof. Xiaowen Chu (Baptist U)

**** ALL are Welcome ****
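
Illustrative note: for readers unfamiliar with the De Bruijn graph representation mentioned in the abstract, the following is a minimal single-threaded sketch of the general idea. It is not UNIPAR's implementation; the function names, the choice of k, and the toy reads are assumptions made only for illustration. Each distinct k-mer in the reads contributes one edge between its (k-1)-length prefix and suffix, and a sequence is read off by following nodes that have exactly one successor.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the (k-1)-mers that follow it in some k-mer.

    Hypothetical helper for illustration; real assemblers also handle
    reverse complements, sequencing errors, and k-mer coverage.
    """
    graph = defaultdict(list)
    seen = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if kmer in seen:  # keep one edge per distinct k-mer
                continue
            seen.add(kmer)
            graph[kmer[:-1]].append(kmer[1:])
    return graph


def walk_unambiguous_path(graph, start):
    """Extend a sequence from `start` while the current node has exactly one successor."""
    sequence = start
    node = start
    visited = {start}
    while len(graph.get(node, [])) == 1:
        node = graph[node][0]
        if node in visited:  # stop if the path loops back on itself
            break
        visited.add(node)
        sequence += node[-1]
    return sequence


if __name__ == "__main__":
    toy_reads = ["ACGTT", "GTTCA", "TTCAG"]  # made-up overlapping fragments
    g = build_de_bruijn(toy_reads, k=4)
    print(walk_unambiguous_path(g, "ACG"))
```

With the toy reads above, the overlapping fragments chain into a single path, so the walk reconstructs one contiguous sequence. Parallel assemblers distribute the k-mer table and this traversal across many cores and machines, which is where concurrent hashing and communication overhead become the main concerns.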