Fast and Lightweight Distributed Suffix Array Construction -- First Results
Manuel Haag, Florian Kurpicz, Peter Sanders, Matthias Schimek
TL;DR
The authors address scalable suffix array construction in distributed memory by adapting the DCX/DC3 framework into a space-efficient, bucketing-based algorithm. They introduce a randomized chunk redistribution technique to achieve provable load balancing and reduce the memory footprint, while keeping running time competitive with PSAC on large datasets. Preliminary MPI-based evaluations show memory advantages and robust performance on Common Crawl and Wikipedia data, with some trade-offs on DNA data. The authors also outline extensions to external memory and multi-GPU environments. These first results suggest a practical path to fast, low-memory distributed suffix sorting for large-scale text processing.
Abstract
We present first algorithmic ideas for a practical and lightweight adaptation of the DCX suffix array construction algorithm [Sanders et al., 2003] to the distributed-memory setting. Our approach relies on a bucketing technique that enables a lightweight implementation using less than half of the memory required by PSAC [Flick and Aluru, 2015], currently the fastest distributed-memory suffix array algorithm, while being competitive or even faster in terms of running time.
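
For readers unfamiliar with the terms, the following minimal sketch illustrates what a suffix array is and the difference cover property that gives the DCX family its name. It is a sequential toy illustration only, not the distributed algorithm or the bucketing technique described above. The function names (`suffix_array_naive`, `is_difference_cover`, `common_shift`) are hypothetical and chosen for this example, and the naive construction sorts whole suffixes, which is far slower than DCX.

```python
def suffix_array_naive(text):
    """Return the starting positions of all suffixes of `text` in
    lexicographic order. Quadratic-or-worse toy construction, used here
    only to show what a suffix array contains."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def is_difference_cover(cover, x):
    """Check that every residue modulo x can be written as a difference
    (a - b) mod x of two elements a, b of `cover`."""
    diffs = {(a - b) % x for a in cover for b in cover}
    return diffs == set(range(x))


def common_shift(i, j, cover, x):
    """Find a shift l < x such that both i + l and j + l fall into the
    sampled residue classes. For a difference cover such a shift always
    exists, which lets any two suffixes be compared after at most x
    character comparisons plus one lookup of sample ranks."""
    for l in range(x):
        if (i + l) % x in cover and (j + l) % x in cover:
            return l
    raise ValueError("no common shift: `cover` is not a difference cover")


if __name__ == "__main__":
    print(suffix_array_naive("banana"))
    # {1, 2} is the difference cover modulo 3 used by DC3;
    # {1, 2, 4} is a difference cover modulo 7.
    print(is_difference_cover({1, 2}, 3))
    print(is_difference_cover({1, 2, 4}, 7))
    print(common_shift(0, 1, {1, 2}, 3))
```

Larger values of X shrink the sampled fraction of suffixes (2/3 for X = 3, 3/7 for X = 7), which is one reason the general DCX formulation is of interest when memory is the limiting resource.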
