GeneTEK: Low-power, high-performance and scalable FPGA architecture for exact unit-cost edit distance

Elena Espinosa, Rubén Rodríguez Álvarez, José Miranda, Rafael Larrosa, Miguel Peón-Quirós, Oscar Plata, David Atienza

Abstract

The advent of next-generation sequencing (NGS) has revolutionized genomic research by enabling cost-effective, high-throughput sequencing of a diverse range of organisms. This breakthrough has unleashed a "Cambrian explosion" in genomic data volume and diversity. The resulting workload volume places genomics among the top four big data challenges anticipated for this decade. In this context, pairwise sequence alignment is a highly time- and energy-intensive step in common bioinformatics pipelines. Speeding up this step requires heuristic approaches, optimized algorithms, and/or hardware acceleration. Although state-of-the-art CPU and GPU implementations have demonstrated significant performance gains, recent FPGA implementations have shown improved energy efficiency. However, FPGA implementations often suffer from limited read-length scalability because hardware resources become a constraint when aligning longer sequences. In this work, we present a flexible FPGA-based accelerator template, scalable up to 1000 bp, that implements Myers's algorithm to compute the exact unit-cost edit distance using high-level synthesis and a worker-based architecture. GeneTEK, a set of instances of this accelerator template in a Xilinx Zynq UltraScale+ FPGA, achieves up to a 113% increase in execution speed and up to a 111x reduction in energy consumption compared to leading CPU and GPU solutions, while fitting comparison matrices up to 13x larger than previous FPGA-based systolic-array solutions. By following a SW-HW co-design approach, GeneTEK exploits parallelization at multiple levels and efficient memory use to deliver a scalable and accurate FPGA-based accelerator. These results reaffirm the potential of FPGAs as an energy-efficient platform for pairwise alignment of read lengths up to 1000 bp.
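To make the abstract's central technique concrete, the following is a minimal software sketch of Myers's bit-vector algorithm for the exact unit-cost (Levenshtein) edit distance between two whole sequences. It is not the authors' high-level-synthesis implementation: the function name `myers_edit_distance` is hypothetical, and the sketch assumes a single arbitrary-precision Python integer per bit vector instead of the fixed-width words and worker-based parallelism an FPGA design would use.

```python
def myers_edit_distance(query: str, target: str) -> int:
    """Exact unit-cost edit distance via Myers's bit-vector recurrence.

    Illustrative sketch only: one Python integer holds a whole column of
    the dynamic-programming matrix, one bit per query character.
    """
    m = len(query)
    if m == 0:
        return len(target)

    mask = (1 << m) - 1      # keeps every vector to m bits
    high = 1 << (m - 1)      # bit for the last row of the matrix

    # peq[c] has bit i set when query[i] == c.
    peq = {}
    for i, c in enumerate(query):
        peq[c] = peq.get(c, 0) | (1 << i)

    pv = mask                # vertical deltas equal to +1 (first column is 0..m)
    mv = 0                   # vertical deltas equal to -1
    score = m                # value of the bottom cell of the current column

    for c in target:
        eq = peq.get(c, 0)
        xv = eq | mv
        xh = (((eq & pv) + pv) ^ pv) | eq
        ph = (mv | ~(xh | pv)) & mask   # horizontal deltas equal to +1
        mh = pv & xh                    # horizontal deltas equal to -1

        # Track the bottom-row value as the column advances.
        if ph & high:
            score += 1
        elif mh & high:
            score -= 1

        # Shift in +1 for row 0: the top row grows by one per target symbol
        # (global alignment boundary condition).
        ph = ((ph << 1) | 1) & mask
        mh = (mh << 1) & mask
        pv = (mh | ~(xv | ph)) & mask
        mv = ph & xv

    return score
```

For example, `myers_edit_distance("kitten", "sitting")` returns 3. Each target symbol costs a constant number of bitwise operations on an m-bit vector, which is the property that makes the algorithm attractive for hardware.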

Paper Structure

This paper contains 27 sections, 6 equations, 12 figures, 8 tables, 1 algorithm.

Figures (12)

  • Figure 1: Three types of errors (i.e., edits).
  • Figure 2: Myers's computing matrix (left) and banded algorithm scheme (right) for a band width of 4 nucleotides.
  • Figure 3: Architecture of GeneTEK. To reduce memory accesses, GeneTEK implements a query buffer. Target sequences are read one by one and compared against the sequences in the query buffer, sending query-target pairs to the workers. Once all the targets have been compared with the queries in the buffer, GeneTEK reads a new set of queries into the internal buffer and repeats the process for all the targets.
  • Figure 4: Distribution of read lengths for each simulated FASTQ file.
  • Figure 5: Comparison between theoretical and real performance, measured in giga cells per second (GCUPS), across different datasets using a fixed read length in each dataset (Group A).
  • ...and 7 more figures