Parallelizing De Novo Assembly with Heterogeneous Processors
PhD Thesis Proposal Defence

Title: "Parallelizing De Novo Assembly with Heterogeneous Processors"

by

Miss Shuang QIU

Abstract:

De Novo assemblers construct genome sequences from small fragments, without using any reference genome. Specifically, they represent the fragments in a De Bruijn graph and traverse the graph to generate the sequence. As constructing and traversing a big De Bruijn graph is both time- and memory-intensive, we develop ParaGraph, a parallel software package that runs this process on a cluster of GPU-equipped computers. In particular, it utilizes all processor cores in each CPU and GPU, all CPUs and GPUs in a computer node, and all computer nodes of the cluster. Furthermore, we analyze the characteristics of genome data to design a concurrent hashing algorithm for the graph construction, and to reduce the communication overhead in the graph traversal. We further improve the overall performance by partitioning and storing the data in a compact format, pipelining data transfer and computation, and overlapping computation and communication. Our experiments show that on real-world datasets, ParaGraph is an order of magnitude faster than the state-of-the-art shared-memory assemblers, and more than five times faster than the current distributed assemblers.

Date: Wednesday, 29 August 2018

Time: 10:00am - 12:00pm

Venue: Room 3494 (lifts 25/26)

Committee Members:
Dr. Qiong Luo (Supervisor)
Dr. Wilfred Ng (Chairperson)
Dr. Ke Yi
Prof. Weichuan Yu (ECE)

**** ALL are Welcome ****
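
For readers unfamiliar with the De Bruijn graph approach mentioned in the abstract, the following is a minimal sequential sketch of the general idea: split fragments into k-mers, link overlapping (k-1)-mers, and walk the graph to recover a sequence. It is not ParaGraph's implementation and does not attempt its concurrent hashing, GPU, or cluster techniques. The function names, the toy reads, and the choice of k = 4 are all assumptions made for illustration.

```python
from collections import defaultdict


def distinct_kmers(reads, k):
    """Collect every distinct substring of length k across all reads."""
    return {read[i:i + k] for read in reads for i in range(len(read) - k + 1)}


def build_de_bruijn(reads, k):
    """Map each (k-1)-mer to the set of (k-1)-mers that can follow it.

    Each distinct k-mer contributes one edge: prefix -> suffix.
    """
    graph = defaultdict(set)
    for kmer in distinct_kmers(reads, k):
        graph[kmer[:-1]].add(kmer[1:])
    return graph


def walk_unambiguous_path(graph, start):
    """Extend a contig from `start` while exactly one successor exists.

    Stops at a dead end, at a branch, or when a node would be revisited.
    """
    contig = start
    node = start
    visited = {start}
    while len(graph.get(node, ())) == 1:
        (successor,) = graph[node]
        if successor in visited:
            break
        contig += successor[-1]  # each step adds one new base
        visited.add(successor)
        node = successor
    return contig


# Hypothetical toy input: three overlapping fragments of one short sequence.
reads = ["ATGGCG", "GGCGTG", "CGTGCA"]
graph = build_de_bruijn(reads, k=4)
print(walk_unambiguous_path(graph, "ATG"))  # prints ATGGCGTGCA
```

Real genome data contains repeats and sequencing errors, so the graph branches and a simple walk like this one is no longer sufficient. The sketch only shows the data structure whose construction and traversal the abstract describes as the expensive steps.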