Accelerating Genome Sequence Analysis

The Hong Kong University of Science and Technology
Department of Computer Science and Engineering


PhD Thesis Defence


Title: "Accelerating Genome Sequence Analysis"

By

Mr. Mian LU


Abstract

Genome sequence analysis is central to today's genomics research. Two
fundamental tasks in this analysis are sequence alignment and
Single-Nucleotide Polymorphism (SNP) detection. Sequence alignment, and
short read alignment in particular, matches DNA fragments generated by
second-generation sequencers to a reference sequence. SNP detection then
identifies single-nucleotide variations between each aligned read and the
reference sequence. As these analysis tasks handle millions to billions of
base pairs of genomic data and perform intensive computation, we
accelerate the analysis system by (1) improving the I/O, memory access,
and computation of each task; and (2) tightly integrating the two tasks to
reduce redundancy.

Specifically, we explore the use of graphics processing units (GPUs) as
hardware accelerators to speed up the in-memory computation of both tasks.
We propose a filtering-verification algorithm for sequence alignment that
better utilizes the GPU's hardware resources, and we design a sparse data
representation format to improve memory access on the GPU in SNP
detection. Finally, we adopt a partition-based alignment storage layout
and customized data compression techniques to reduce the I/O cost and to
improve the overall speed of the system. Experimental results show that
our system accelerates state-of-the-art tools by an order of magnitude.


Date:			Wednesday, 8 August 2012

Time:			2:00pm – 4:00pm

Venue:			Room 3501
 			Lifts 25/26

Chairman:		Prof. Xiangtong Qi (IELM)

Committee Members:	Prof. Qiong Luo (Supervisor)
 			Prof. Frederick Lochovsky
 			Prof. Raymond Wong
 			Prof. Jun Xia (LIFS)
 			Prof. Xiaowen Chu (Comp. Sci., BaptistU)


**** ALL are Welcome ****