ACCELERATING PROTEIN STRUCTURE ALIGNMENT WITH MULTICORE-CPUS AND GRAPHIC PROCESSORS

MPhil Thesis Defence


Title: "ACCELERATING PROTEIN STRUCTURE ALIGNMENT WITH MULTICORE-CPUS AND 
GRAPHIC PROCESSORS"

By

Mr. Yishui WU


Abstract

Among a variety of protein analysis tools, the pairwise protein structural 
alignment tool is the one used to identify structural similarities between two 
different proteins. It is widely applied to analyze the functionalities and 
evolutionary relationships of proteins. With the rapid growth of new 3D protein 
structures and the further development of protein research work, protein 
structural alignment becomes more and more important in this field.

Compared with different structural alignment tools, DALIX is the one capable of 
calculating an optimal structural alignment based on DALI score. It outperforms 
DALI, one of the most popular structural alignment algorithms, in most cases 
when aligning relatively short proteins. However, DALIX has high time 
complexity, which restricts its practical application.

In this work, we design parallel versions of DALIX to accelerate it. We 
implement a coarse-grained parallel algorithm using OPENMP on multicore-CPUs 
platforms and also a fine-grained parallel algorithm using CUDA on GPUs. In 
both versions, we parallelize the following components of DALIX: 1) Computation 
of the DALI score array, 2) algorithm engineering, 3) 2-level dynamic 
programming and 4) updating of the Lagrangian multipliers. In particular, to 
better utilize the massive GPU thread parallelism, in the CUDA version of 
DALIX, we design a 2-level parallel algorithm for the 2-level dynamic 
programming procedure. We compact the decision table of dynamic programming so 
that it can fit into the shared memory for inter thread communication, which 
improves the performance. We have compared the two parallel algorithms with the 
serial version of DALIX on a set of protein alignments. The CUDA version is run 
on an NVIDIA GTX960 GPU while the OPENMP version and the serial version are 
executed on a platform equipped with 2 Intel Xeon E5620 CPUs. The results show 
that the CUDA version outperforms the serial version with a speedup ranging 
from 5.5x to 20x. The OPENMP version also has a speedup of around 7x.


Date:			Thursday, 2 June 2016

Time:			10:30am - 12:30pm

Venue:			Room 5506
 			Lifts 25/26

Committee Members:	Dr. Qiong Luo (Supervisor)
 			Prof. Nevin Zhang (Chairperson)
 			Dr. Weichuan Yu (ECE)


**** ALL are Welcome ****