Btrim: A fast, lightweight adapter and quality trimming program for next-generation sequencing technologies
Yong Kong
TL;DR
Adapter contamination and low-quality tails in next-generation sequencing reads can confound mapping and de novo assembly, and reliable barcode demultiplexing is essential for multiplexed samples. The paper presents Btrim, a fast, lightweight stand-alone tool that trims adapters and low-quality regions using a modified Myers's bit-vector dynamic programming algorithm to tolerate indels in adapters and barcodes, complemented by a moving-window quality trim. Its key contributions include exact boundary determination for 5'- and 3'-adapters, backward search to obtain start positions, and rich per-read trimming records, all implemented in C with low memory usage and FASTQ support. Together these features enable Btrim to serve as the initial step in diverse NGS pipelines, improving downstream mapping and assembly performance.
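The moving-window quality trim mentioned above can be illustrated with a minimal sketch. This is not Btrim's code: the function names, the window size of 5, the average-quality cutoff of 15, and the Phred+33 encoding are assumptions chosen for illustration, and the rule used here (cut at the start of the first window whose mean quality falls below the cutoff) is only one common variant of window-based trimming.

```python
def decode_phred(quality_string, offset=33):
    """Convert a FASTQ quality string to integer scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_string]


def window_trim_3prime(qualities, window=5, cutoff=15):
    """Return how many leading bases to keep.

    Slides a fixed-size window from the 5' end and cuts the read at the
    start of the first window whose mean quality is below `cutoff`.
    `window` and `cutoff` are illustrative values, not Btrim defaults.
    """
    n = len(qualities)
    if n < window:
        # Read shorter than one window: keep it only if its mean passes.
        return n if n and sum(qualities) / n >= cutoff else 0

    window_sum = sum(qualities[:window])
    for start in range(n - window + 1):
        if start > 0:
            # Update the running sum: add the new base, drop the old one.
            window_sum += qualities[start + window - 1] - qualities[start - 1]
        if window_sum < cutoff * window:
            return start
    return n


if __name__ == "__main__":
    quals = [30] * 10 + [2] * 5
    keep = window_trim_3prime(quals)
    print("bases kept:", keep)
```

Because the cut is placed at the start of the failing window, this variant can discard a few good bases just before a low-quality tail; a stricter boundary rule would need an extra step.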
Abstract
Btrim is a fast, lightweight program for trimming adapters and low-quality regions from reads produced by ultra-high-throughput next-generation sequencing machines. It can also reliably identify barcodes and assign reads to their original samples. Because it is based on a modified Myers's bit-vector dynamic programming algorithm, Btrim can handle indels in adapters and barcodes. It removes low-quality regions and trims adapters at either or both ends of the reads. A typical run trimming 30 million reads with two sets of adapter pairs finishes in about a minute with a small memory footprint. Btrim is a versatile stand-alone tool that can serve as the first step in virtually all next-generation sequence analysis pipelines. The program is available at http://graphics.med.yale.edu/trim/.
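To make the indel tolerance concrete, the sketch below implements the textbook form of Myers's bit-vector algorithm for approximate matching. It is an illustration under stated assumptions, not Btrim's implementation: Btrim is written in C and uses a modified version whose details are not given here, and the function name `myers_search` and its return format are hypothetical. The routine reports every text position at which the adapter ends with at most `max_edits` substitutions, insertions, or deletions.

```python
def myers_search(pattern, text, max_edits):
    """Find approximate occurrences of `pattern` in `text`.

    Returns a list of (end_index, edits) pairs, where `end_index` is the
    0-based position in `text` at which a match ends and `edits` is the
    edit distance of the best match ending there. Textbook Myers
    bit-vector search; Python integers stand in for machine words.
    """
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")

    mask = (1 << m) - 1          # keep only the m low bits
    high_bit = 1 << (m - 1)      # bit for the last pattern row

    # peq[c] has bit i set where pattern[i] == c.
    peq = {}
    for i, ch in enumerate(pattern):
        peq[ch] = peq.get(ch, 0) | (1 << i)

    pv, mv, score = mask, 0, m   # vertical deltas: all +1 in column 0
    hits = []
    for j, ch in enumerate(text):
        eq = peq.get(ch, 0)
        xv = eq | mv
        xh = ((((eq & pv) + pv) ^ pv) | eq) & mask
        ph = (mv | ~(xh | pv)) & mask   # horizontal +1 deltas
        mh = pv & xh                    # horizontal -1 deltas
        if ph & high_bit:
            score += 1
        elif mh & high_bit:
            score -= 1
        ph = (ph << 1) & mask
        mh = (mh << 1) & mask
        pv = (mh | ~(xv | ph)) & mask
        mv = ph & xv
        if score <= max_edits:
            hits.append((j, score))
    return hits


if __name__ == "__main__":
    # Exact occurrence of the adapter, then one with a deleted base.
    print(myers_search("ACGT", "TTACGTTT", 0))
    print(myers_search("ACGT", "GGACTGG", 1))
```

The algorithm reports only where a match ends. Recovering the start position requires a second pass, for example by running the same search backward over the reversed pattern and text from the reported end.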
