GateKeeper-GPU: Fast and Accurate Pre-Alignment Filtering in Short Read Mapping
Zülal Bingöl, Mohammed Alser, Onur Mutlu, Ozcan Ozturk, Can Alkan
TL;DR
This work tackles the computational bottleneck in short read mapping caused by the expensive verification step, which traditionally relies on quadratic-time dynamic programming. It introduces GateKeeper-GPU, a CUDA-based pre-alignment filter that improves filtering accuracy over prior GateKeeper implementations and exploits massive GPU parallelism to rapidly assess many read-reference pairs. The authors integrate GateKeeper-GPU with mrFAST, conduct thorough evaluations of accuracy, throughput, and resource use across real and simulated data, and demonstrate substantial end-to-end speedups (up to 1.4x) and verification-time reductions (up to 2.9x). The results indicate GateKeeper-GPU is a practical, scalable enhancement for read mapping pipelines, with two encoding modes and clear guidance on performance trade-offs and future optimizations.
Abstract
At the last step of short read mapping, the candidate locations of the reads on the reference genome are verified by using sequence alignment algorithms to compute their differences from the corresponding reference segments. Calculating the similarities and differences between two sequences is still computationally expensive, since approximate string matching techniques traditionally rely on dynamic programming algorithms with quadratic time and space complexity. We introduce GateKeeper-GPU, a fast and accurate pre-alignment filter that efficiently reduces the need for expensive sequence alignment. GateKeeper-GPU provides two main contributions: first, it improves the filtering accuracy of GateKeeper (a lightweight pre-alignment filter), and second, it exploits the massive parallelism provided by the large number of threads on modern GPUs to examine numerous sequence pairs rapidly and concurrently. By reducing the alignment workload, GateKeeper-GPU accelerates sequence alignment by up to 2.9x and speeds up the end-to-end execution time of a comprehensive read mapper (mrFAST) by up to 1.4x. GateKeeper-GPU is available at https://github.com/BilkentCompGen/GateKeeper-GPU.
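
To make the filter-then-verify idea concrete, the following is a minimal CPU-only sketch in Python. It is not the GateKeeper-GPU algorithm or its CUDA implementation: the filter is a deliberately simplified stand-in that counts read bases with no matching reference base within a window of +/- max_edits positions, which is a lower bound on the edit distance. All function names (edit_distance, unmatched_lower_bound, passes_filter, verify_candidates) are hypothetical and chosen for illustration only.

```python
from typing import List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic dynamic program.

    Runs in O(len(a) * len(b)) time; this variant keeps only two rows,
    so it does not store the full quadratic-size table.
    """
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # match or substitute
        prev = curr
    return prev[-1]


def unmatched_lower_bound(read: str, ref: str, max_edits: int) -> int:
    """Count read bases that match no reference base within +/- max_edits.

    In any alignment with at most max_edits edits, a matched read base at
    position i pairs with a reference base at offset at most max_edits,
    so every base counted here must take part in an edit.
    """
    unmatched = 0
    for i, base in enumerate(read):
        lo = max(0, i - max_edits)
        hi = min(len(ref), i + max_edits + 1)
        if base not in ref[lo:hi]:
            unmatched += 1
    return unmatched


def passes_filter(read: str, ref: str, max_edits: int) -> bool:
    """Cheap test: reject only pairs that provably exceed max_edits."""
    return unmatched_lower_bound(read, ref, max_edits) <= max_edits


def verify_candidates(read: str, candidates: List[str],
                      max_edits: int) -> Tuple[List[str], int]:
    """Run the quadratic-time verification only on pairs the filter keeps.

    Returns the accepted reference segments and the number of DP calls made.
    """
    accepted: List[str] = []
    dp_calls = 0
    for ref in candidates:
        if not passes_filter(read, ref, max_edits):
            continue  # rejected cheaply, no alignment needed
        dp_calls += 1
        if edit_distance(read, ref) <= max_edits:
            accepted.append(ref)
    return accepted, dp_calls
```

Calling verify_candidates("ACGTACGTAC", ["ACGTACGTAC", "ACGTACGTAA", "TTTTTTTTTT"], 2) runs the dynamic program on the first two candidates only, because the third is rejected by the filter. Since the filter is a lower bound, it never discards a pair that is within the edit threshold, but it can let through dissimilar pairs that the verification step then rejects. Each pair is checked independently, which is the property that makes this kind of filtering a natural fit for one-pair-per-thread GPU execution.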
