Inverting Parameterized Burrows-Wheeler Transform
Shogen Kawanami, Kento Iseri, Tomohiro I
TL;DR
This paper proves that the parameterized Burrows-Wheeler Transform (pBWT) is invertible: from the pBWT $\mathsf{L}$ of a p-string $\mathsf{T}$ of length $n$, one can recover $\mathsf{T}$, up to renaming of parameter symbols, in $O(n^2)$ time using $O(n)$ space. The authors formalize parameterized strings over static symbols (s-symbols) and parameter symbols (p-symbols), use prev-encoding to capture p-matching, and define the pBWT together with its LF-mapping. The inversion is constructive and is presented in two stages: first an $O(n^3)$-time, $O(n^2)$-space approach based on prefix encodings, then an optimized $O(n^2)$-time, $O(n)$-space method that iteratively refines rank arrays to recover the LF-mapping. Once the LF-mapping is known, a p-string that p-matches $\mathsf{T}$ is reconstructed in subquadratic time. The work lays the groundwork for using pBWTs as compact indexes for parameterized strings. It also discusses an extension to prev$_\infty$-encoding and, as future work, inversion in subquadratic total time, with implications for p-matching data structures and for reverse-engineering string indexes.
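To make the role of prev-encoding concrete, here is a minimal Python sketch (not the authors' code). It assumes the common convention in which the first occurrence of a p-symbol is encoded as 0 and every later occurrence as the distance to its previous occurrence, while s-symbols are kept as-is. The function names `prev_encode` and `p_match` and the example alphabet are hypothetical and chosen only for illustration.

```python
def prev_encode(t, p_symbols):
    """Return the prev-encoding of t as a list.

    Assumed convention: s-symbols are copied unchanged; a p-symbol is
    replaced by 0 at its first occurrence and otherwise by the distance
    to its previous occurrence.
    """
    last_seen = {}  # p-symbol -> index of its most recent occurrence
    encoded = []
    for i, c in enumerate(t):
        if c in p_symbols:
            encoded.append(i - last_seen[c] if c in last_seen else 0)
            last_seen[c] = i
        else:
            encoded.append(c)
    return encoded


def p_match(x, y, p_symbols):
    """Two p-strings p-match iff their prev-encodings are equal."""
    return prev_encode(x, p_symbols) == prev_encode(y, p_symbols)


# Illustrative alphabet: 'a', 'b' are s-symbols; 'x', 'y', 'z' are p-symbols.
P = {"x", "y", "z"}
print(prev_encode("xayxb", P))        # [0, 'a', 0, 3, 'b']
print(p_match("xayxb", "yazyb", P))   # True: rename x->y, y->z
print(p_match("xayxb", "xayyb", P))   # False
```

Because the encoding depends only on the pattern of repetitions, any consistent renaming of p-symbols leaves it unchanged, which is why inversion can recover the original string only up to such a renaming.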
Abstract
The Burrows-Wheeler Transform (BWT) of a string is an invertible permutation of the string, which can be used for data compression and compact indexes for string pattern matching. Ganguly et al. [SODA, 2017] introduced the parameterized BWT (pBWT) to design compact indexes for parameterized matching (p-matching), a variant of string pattern matching with parameter symbols introduced by Baker [STOC, 1993]. Although the pBWT was inspired by the BWT, it is not obvious whether the pBWT itself is invertible or not. In this paper we show that we can retrieve the original string (up to renaming of parameter symbols) from the pBWT of length $n$ in $O(n^2)$ time and $O(n)$ space.
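For contrast with the parameterized case, the following Python sketch shows the textbook way the classic BWT is inverted through its LF-mapping. It is not the paper's algorithm: for the ordinary BWT the LF-mapping follows from a stable sort of the transformed string, whereas for the pBWT recovering the LF-mapping is the non-obvious step. The sketch assumes a sentinel `$` that does not occur in the text and is smaller than every other character; the function names are hypothetical.

```python
def bwt(text, sentinel="$"):
    """Naive BWT: last column of the sorted rotations of text + sentinel."""
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(r[-1] for r in rotations)


def inverse_bwt(last):
    """Invert the classic BWT using the LF-mapping.

    Assumes the sentinel is the unique smallest character, so row 0 of
    the sorted rotations starts with it.
    """
    n = len(last)
    # A stable sort of positions by character yields the first column F;
    # the k-th occurrence of a character in L maps to its k-th occurrence in F.
    order = sorted(range(n), key=lambda i: last[i])
    lf = [0] * n
    for f_pos, l_pos in enumerate(order):
        lf[l_pos] = f_pos
    # Walk backwards through the text, starting from the sentinel's row.
    out = []
    i = 0
    for _ in range(n - 1):
        out.append(last[i])
        i = lf[i]
    return "".join(reversed(out))


transformed = bwt("banana")
print(transformed)               # annb$aa
print(inverse_bwt(transformed))  # banana
```

The naive rotation sort is quadratic in the worst case and is used here only for brevity; the inversion itself runs in time dominated by the sort.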
