Bancroft: Genomics Acceleration Beyond On-Device Memory
Se-Min Lim, Seongyoung Kang, Sang-Woo Jun
TL;DR
Bancroft tackles the memory-capacity bottleneck in data-heavy genomics by providing the illusion of practically unlimited on-device memory: a hardware-accelerated, reference-based compression pipeline, paired with a software manager and a familiar programming interface, compresses genomic data as it moves over PCIe. The core innovations are fixed-k k-mer matching, fixed-stride alignment, grouped headers, and cuckoo-hash match discovery, which together enable high compression ratios and fast decompression on FPGA hardware. The prototype, built on an Alveo U50 with 8 GB of HBM, demonstrates access to TB-scale host-side data at over 30% of the HBM bandwidth and compression throughput of up to 3.7 GB/s, outperforming prior FPGA and many GPU baselines. This approach decouples data scalability from growth in on-device memory capacity, enabling scalable genomic analytics on commodity hardware.
Abstract
This paper presents Bancroft, a computational genomics acceleration platform that provides the illusion of practically infinite on-device memory capacity by compressing genomic data movement over PCIe. Bancroft introduces novel optimizations to reference-based genome compression for efficient accelerator implementation, including fixed-stride matching using cuckoo hashes and grouped header encoding, incorporated into a familiar interface supporting random accesses. We evaluate a prototype implementation of Bancroft on an affordable Alveo U50 FPGA equipped with 8 GB of HBM. Thanks to orders-of-magnitude improvements in the performance and resource efficiency of genomic compression, our prototype provides access to TBs of host-side genomic data at memory-class performance, measuring speeds over 30% of the on-device HBM bandwidth, an order of magnitude higher than conventional PCIe-limited architectures. Using a real-world pre-alignment filtering application, Bancroft demonstrates over 6x improvement over the conventional PCIe-attached architecture, achieving 30% of the peak internal throughput of an accelerator with HBM, and 90% of that of one with DDR4. Bancroft supports memory-class performance for practically infinite data capacity using a small, fixed amount of HBM, making it an attractive solution for the continued scalability of computational genomics.
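To make the idea of reference-based compression with fixed-stride matching more concrete, the following is a minimal software sketch, not Bancroft's actual design. It assumes a hypothetical k-mer length `K = 8`, a stride equal to `K`, and a simple token format (`"match"` with a reference position, or `"literal"` with raw bases); a Python `dict` stands in for the cuckoo hash table, and grouped header encoding is not modeled. The function names `build_index`, `compress`, and `decompress` are illustrative only.

```python
from typing import Dict, List, Tuple, Union

K = 8  # hypothetical fixed k-mer length; the real value is not given here

# A token is either ("match", reference_position) or ("literal", raw_bases).
Token = Tuple[str, Union[int, str]]


def build_index(reference: str, k: int = K) -> Dict[str, int]:
    """Map every k-mer of the reference to the first position where it occurs.

    A dict is used as a stand-in for a cuckoo hash table.
    """
    index: Dict[str, int] = {}
    for pos in range(len(reference) - k + 1):
        index.setdefault(reference[pos:pos + k], pos)
    return index


def compress(data: str, index: Dict[str, int], k: int = K) -> List[Token]:
    """Scan the input at a fixed stride of k and look up each chunk in the index."""
    tokens: List[Token] = []
    for start in range(0, len(data), k):
        chunk = data[start:start + k]
        # Only full-length chunks can match; a short tail is stored as a literal.
        pos = index.get(chunk) if len(chunk) == k else None
        if pos is not None:
            tokens.append(("match", pos))
        else:
            tokens.append(("literal", chunk))
    return tokens


def decompress(tokens: List[Token], reference: str, k: int = K) -> str:
    """Rebuild the original data by copying matched k-mers from the reference."""
    parts: List[str] = []
    for kind, value in tokens:
        if kind == "match":
            parts.append(reference[value:value + k])
        else:
            parts.append(value)
    return "".join(parts)


reference = "ACGTACGTTTGACCAGGATCCA"
data = reference[4:12] + "NNNNNNNN" + reference[10:18] + "AC"
index = build_index(reference)
tokens = compress(data, index)
restored = decompress(tokens, reference)
```

Because every chunk has the same length and starts at a known offset, each token can be decoded independently of the others. That independence is one plausible reason a fixed-stride scheme suits parallel hardware and random access, although the sketch makes no claim about how Bancroft actually lays out its compressed stream.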
