High-throughput Protein Sequence Alignment on Multi-core Systems

Authors

  • Muhammad Yahya, University of Engineering and Technology Peshawar
  • Laiq Hasan, University of Engineering and Technology Peshawar
  • Syed Asad Ali, CECOS University Peshawar

Keywords:

Bioinformatics Sequence Alignment, High Performance/Throughput Computing, Algorithmic Implementations on GPUs

Abstract

Rapid advances in sequencing technologies are generating data on an enormous scale. A central challenge in analyzing data at this scale is the alignment of DNA/protein sequences, in which reads are compared to reference sequences. To find similar sequences, alignment algorithms align a query sequence against a database. Alignment can be used to classify the source of a sequence, to discover similarities among organisms, or to infer an ancestral relationship. A wide range of alignment algorithms has been developed in recent years. In this paper, an accurate method of accelerating such algorithms using GPUs is investigated. The Swiss-Prot database is processed using a GPU implementation of the Smith-Waterman sequence alignment algorithm. The first step in the process generates the alignment scores but not the actual alignments. Available alignment tools such as ssearch2 are then used to align the output file generated during the first step. The GPU-accelerated implementation is then evaluated against other techniques for performance/throughput improvement. The Swiss-Prot database was aligned using various alignment tools. An NVIDIA Tesla K40 GPU was used to generate the results for this research. The implementation achieves 44.3 giga cell updates per second (GCUPS), which is 22.9 times higher than the same implementation on a GTX 275. The improvement comes from distributing the workload of equal-length sequences evenly among all threads on the GPU's multiprocessors.
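
To make the score-only first step and the GCUPS metric concrete, the following is a minimal CPU sketch in Python, not the authors' GPU implementation. The function names `sw_score` and `gcups` and the match/mismatch/gap values are illustrative assumptions; the abstract does not state the scoring scheme, and protein aligners typically use a substitution matrix and affine gap penalties instead of the simple linear scheme shown here.

```python
def sw_score(query, target, match=2, mismatch=-1, gap=-2):
    """Return the best Smith-Waterman local alignment score.

    Score only: no traceback matrix is kept, so the alignment itself
    is not recovered. Scoring values are illustrative assumptions.
    """
    prev = [0] * (len(target) + 1)  # previous row of the DP matrix
    best = 0
    for q in query:
        curr = [0]  # column 0 is always zero for local alignment
        for j, t in enumerate(target, start=1):
            diag = prev[j - 1] + (match if q == t else mismatch)
            up = prev[j] + gap        # gap in the target
            left = curr[j - 1] + gap  # gap in the query
            cell = max(0, diag, up, left)
            curr.append(cell)
            best = max(best, cell)
        prev = curr
    return best


def gcups(query_len, db_residues, seconds):
    """Giga cell updates per second: DP cells filled per second / 1e9."""
    return query_len * db_residues / (seconds * 1e9)


if __name__ == "__main__":
    print(sw_score("MKVLAT", "GGMKVLATGG"))
    print(gcups(1000, 1_000_000, 1.0))
```

Because only two rows of the matrix are held at a time, memory grows with the length of one sequence and not with the product of both lengths, which is one reason score-only passes are well suited to scanning a whole database before a separate tool produces the actual alignments.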

Published

03-08-2020

How to Cite

Yahya, M., Hasan, L., & Ali, S. A. (2020). High-throughput Protein Sequence Alignment on Multi-core Systems. International Journal of Integrated Engineering, 12(7), 62-71. https://publisher.uthm.edu.my/ojs/index.php/ijie/article/view/6609