Computing String Covers in Sublinear Time
Jakub Radoszewski, Wiktor Zuba
TL;DR
This work shows that, for a string $T$ of length $n$ given in packed representation, a representation of all covers of $T$, and in particular the shortest cover, can be computed in $O\left(n/\log_\sigma n\right)$ time, which is sublinear in $n$ for small alphabets. It then introduces a sublinear-space data structure that returns $\mathrm{Cov}_T[\ell]$, the length of the shortest cover of the length-$\ell$ prefix of $T$, in $O(1)$ time using $O\left(n(\log\sigma + \log\log n)/\log n\right)$ space, supported by an online algorithm that derives the shortest cover from structural properties and internal pattern matching (IPM) queries. Additionally, it characterizes the cover arrays of Fibonacci strings and gives a lower bound in the PILLAR model: the shortest cover of a length-$n$ string cannot be computed using $o\left(n/\log n\right)$ operations in that model. Together, the results give an optimal-time algorithm for packed inputs and identify a limit on what can be achieved in the PILLAR model.
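As a quick sanity check on the bound (a sketch assuming the standard word RAM setting with machine words of $w = \Theta(\log n)$ bits and $\sigma \ge 2$): each character takes $\lceil \log \sigma \rceil$ bits, and $\log_\sigma n = \log n / \log \sigma$, so the number of words needed to store $T$ is

$$\left\lceil \frac{n \lceil \log \sigma \rceil}{w} \right\rceil = O\left(\frac{n \log \sigma}{\log n}\right) = O\left(\frac{n}{\log_\sigma n}\right).$$

An algorithm running in $O\left(n/\log_\sigma n\right)$ time is therefore linear in the size of the packed input, even though it is sublinear in the number of characters $n$.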
Abstract
Let $T$ be a string of length $n$ over an integer alphabet of size $\sigma$. In the word RAM model, $T$ can be represented in $O(n/\log_\sigma n)$ space. We show that a representation of all covers of $T$ can be computed in the optimal $O(n/\log_\sigma n)$ time; in particular, the shortest cover can be computed within this time. We also design an $O(n(\log\sigma + \log\log n)/\log n)$-sized data structure that computes in $O(1)$ time any element of the so-called (shortest) cover array of $T$, that is, the length of the shortest cover of any given prefix of $T$. As a by-product, we describe the structure of cover arrays of Fibonacci strings. On the negative side, we show that the shortest cover of a length-$n$ string cannot be computed using $o(n/\log n)$ operations in the PILLAR model of Charalampopoulos, Kociumaka, and Wellnitz (FOCS 2020).
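To make the notion of a cover array concrete, the following minimal Python sketch computes it in $O(n)$ time using the classical border-based online method. This is not the sublinear-time algorithm of the paper; it works on an ordinary unpacked string and is meant only to illustrate the definition. The function names `cover_array`, `naive_shortest_cover`, and `fibonacci_string` are hypothetical helpers introduced here, and the Fibonacci convention ($F_0 = \texttt{b}$, $F_1 = \texttt{a}$, $F_k = F_{k-1}F_{k-2}$) is an assumption that may differ from the one used in the paper.

```python
def cover_array(t):
    """Return [Cov[1], ..., Cov[n]]: Cov[i] is the length of the shortest
    cover of the prefix t[:i]. Classical linear-time method (illustrative
    only; not the packed sublinear-time algorithm)."""
    n = len(t)

    # border[i] = length of the longest proper border of t[:i]
    border = [0] * (n + 1)
    k = 0
    for i in range(2, n + 1):
        while k > 0 and t[i - 1] != t[k]:
            k = border[k]
        if t[i - 1] == t[k]:
            k += 1
        border[i] = k

    cov = [0] * (n + 1)
    # reach[c] = longest prefix length seen so far whose shortest cover is t[:c]
    reach = [0] * (n + 1)
    for i in range(1, n + 1):
        # The only candidate shorter than i is the shortest cover of the
        # longest border of t[:i].
        c = cov[border[i]]
        if border[i] > 0 and reach[c] >= i - c:
            cov[i] = c      # occurrences of t[:c] reach far enough to cover t[:i]
            reach[c] = i
        else:
            cov[i] = i      # t[:i] is only covered by itself
            reach[i] = i
    return cov[1:]


def naive_shortest_cover(s):
    """Brute-force reference for a non-empty string: try each prefix length."""
    for c in range(1, len(s) + 1):
        covered = 0  # positions [0, covered) are covered so far
        for i in range(len(s) - c + 1):
            if i > covered:
                break  # a gap can no longer be filled
            if s.startswith(s[:c], i):
                covered = i + c
        if covered == len(s):
            return c


def fibonacci_string(k):
    """Assumed convention: F_0 = 'b', F_1 = 'a', F_k = F_{k-1} + F_{k-2}."""
    prev, cur = "b", "a"
    if k == 0:
        return prev
    for _ in range(k - 1):
        prev, cur = cur, cur + prev
    return cur


if __name__ == "__main__":
    print(cover_array("ababa"))
    print(cover_array(fibonacci_string(7)))
```

For example, `cover_array("ababa")` returns `[1, 2, 3, 2, 3]`: the prefix `abab` has shortest cover `ab`, and the whole string has shortest cover `aba`. Printing the array for a Fibonacci string gives a small view of the regular structure that the paper characterizes.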
