A visualization tool to explore alphabet orderings for the Burrows-Wheeler Transform

Lily Major, Dave Davies, Amanda Clare, Jacqueline W. Daykin, Benjamin Mora, Christine Zarges

TL;DR

A graphical user interface (GUI) for working with BWTs is presented, which includes features for searching for matrix row prefixes, skipping over sections in the right-most column (the transform), and displaying BWTs while exploring alphabet orderings with the goal of minimizing the number of runs.

Abstract

The Burrows-Wheeler Transform (BWT) is an efficient, invertible text transformation that tends to group identical characters together into runs and enables search of the text. It is used extensively in lossless compression, in indexing, and in bioinformatics for sequence alignment. There has been recent interest in minimizing the number of runs of identical characters ($r$) in a transform and in finding useful alphabet orderings for the sorting step of the matrix associated with BWT construction. This motivates the inspection of many transforms while developing algorithms. However, the full Burrows-Wheeler matrix requires $O(n^2)$ space for an input of length $n$ and is therefore very difficult to display and inspect for large inputs. In this paper we present a graphical user interface (GUI) for working with BWTs, which includes features for searching for matrix row prefixes, skipping over sections in the right-most column (the transform), and displaying BWTs while exploring alphabet orderings with the goal of minimizing the number of runs.
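
To make the space concern concrete, the following is a minimal Python sketch, not the tool's implementation, of how the rows of the Burrows-Wheeler matrix can be represented by rotation start indices alone, so that the transform and a row-prefix search need only $O(n)$ extra space. It assumes the matrix rows are the cyclic rotations of the text with no end-of-text marker and that characters compare in their default (code point) order; the function names `rotation_order`, `bwt_from_order`, and `rows_with_prefix` are hypothetical.

```python
from functools import cmp_to_key


def rotation_order(text):
    """Start indices of the cyclic rotations of `text`, in sorted row order.

    Rotations are compared character by character, so no rotation string
    is ever materialised (assumption: no end-of-text marker is appended).
    """
    n = len(text)

    def compare(i, j):
        for k in range(n):
            a, b = text[(i + k) % n], text[(j + k) % n]
            if a != b:
                return -1 if a < b else 1
        return 0

    return sorted(range(n), key=cmp_to_key(compare))


def bwt_from_order(text, order):
    """Right-most matrix column: the character preceding each rotation start."""
    n = len(text)
    return "".join(text[(start - 1) % n] for start in order)


def rows_with_prefix(text, order, prefix):
    """Range of matrix rows that begin with `prefix`, found by binary search."""
    n, m = len(text), len(prefix)

    def row_prefix(row):
        start = order[row]
        return "".join(text[(start + k) % n] for k in range(m))

    def first_row(pred):
        lo, hi = 0, n
        while lo < hi:
            mid = (lo + hi) // 2
            if pred(row_prefix(mid)):
                hi = mid
            else:
                lo = mid + 1
        return lo

    return range(first_row(lambda p: p >= prefix), first_row(lambda p: p > prefix))


text = "banana"
order = rotation_order(text)
print(bwt_from_order(text, order))                # nnbaaa
print(list(rows_with_prefix(text, order, "an")))  # [1, 2]
```

Any single cell of the matrix can be recovered on demand as `text[(order[row] + col) % len(text)]`, which is one way a viewer could display only the visible portion of a large matrix. The comparison-based sort here is quadratic in the worst case; a suffix-array construction would be the usual choice for large inputs.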

Paper Structure (13 sections, 1 figure)

Figures (1)

  • Figure 1: Three BWTs of the same input text, $aacaacaacbdccccc$, each computed using a different alphabet order. The left Burrows-Wheeler matrix (BWM) uses ASCII ordering. In the center and right BWMs the alphabet has been re-ordered from ASCII with the goal of reducing the number of runs by joining runs of $a$ characters.
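
The kind of exploration shown in Figure 1 can be sketched in a few lines of Python. This is an illustrative brute-force search, not the method used by the tool: it assumes the BWT is taken over plain cyclic rotations with no end-of-text marker, and the helper names `bwt` and `count_runs` are hypothetical. The specific orderings used in the center and right panels are not stated in the caption, so the sketch simply enumerates all 24 orderings of the four-letter alphabet and reports one with the fewest runs.

```python
from itertools import groupby, permutations


def bwt(text, alphabet):
    """BWT of `text` with rotations sorted under the given alphabet order.

    `alphabet` lists the characters from smallest to largest. Assumption:
    rows are cyclic rotations and no end-of-text marker is appended.
    """
    rank = {ch: r for r, ch in enumerate(alphabet)}
    n = len(text)
    ranks = [rank[ch] for ch in text]
    doubled = ranks + ranks  # rotation i is doubled[i:i + n]
    order = sorted(range(n), key=lambda i: doubled[i:i + n])
    return "".join(text[(i - 1) % n] for i in order)


def count_runs(s):
    """Number of maximal runs of identical characters (the quantity r)."""
    return sum(1 for _ in groupby(s))


text = "aacaacaacbdccccc"               # input text from Figure 1
ascii_order = "".join(sorted(set(text)))  # "abcd"

# Exhaustive search is only feasible for tiny alphabets (k! orderings).
results = {"".join(p): bwt(text, p) for p in permutations(ascii_order)}
best_order = min(results, key=lambda o: count_runs(results[o]))

print("ASCII order:", results[ascii_order], count_runs(results[ascii_order]))
print("Best order: ", best_order, results[best_order], count_runs(results[best_order]))
```

Slicing a rank list for every rotation builds the full matrix implicitly, which is acceptable for a 16-character example but is exactly the $O(n^2)$ cost that becomes impractical for long inputs.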