Online and Offline Algorithms for Counting Distinct Closed Factors via Sliding Suffix Trees
Takuya Mieno, Shun Takahashi, Kazuhisa Seto, Takashi Horiyama
TL;DR
This work counts the distinct closed factors of a string using sliding suffix trees, in both an online and an offline setting. The online algorithm runs in $O(n\log\sigma)$ time and $O(n)$ space by combining Ukkonen's suffix tree construction with suffix trees for sliding windows. The offline algorithm runs in $O(n)$ time and space for linearly sortable alphabets by simulating the sliding suffix trees on a static suffix tree with weighted ancestor queries (WAQ). The counting rests on a border-based characterization using $t_j=\mathrm{lrs}(T[1..j])$ and $z_j=\mathrm{lrs}^2(T[1..j])$, which enables the linear-time offline processing and opens a path to enumeration via geometric range data structures. The paper also explores enumeration trade-offs: it shows subquadratic enumeration under certain conditions but leaves open whether enumeration in $O(n\,\mathrm{polylog}(n) + \mathrm{output})$ time is achievable. These results advance the efficient analysis of repetitive structures in strings and contribute techniques for sliding-window string processing.
Abstract
A string is said to be closed if its length is one, or if it has a non-empty factor that occurs both as a prefix and as a suffix of the string, but does not occur elsewhere. The notion of closed words was introduced by [Fici, WORDS 2011]. Recently, the maximum number of distinct closed factors occurring in a string was investigated by [Parshina and Puzynina, Theor. Comput. Sci. 2024], and an asymptotically tight bound was proved. In this paper, we propose two algorithms to count the distinct closed factors in a string T of length n over an alphabet of size σ. The first algorithm runs in O(n log σ) time using O(n) space for string T given in an online manner. The second algorithm runs in O(n) time using O(n) space for string T given in an offline manner. Both algorithms utilize suffix trees for sliding windows.
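To make the definition of a closed string concrete, the following is a minimal brute-force sketch in Python. It is not either of the paper's algorithms: it simply tests the definition directly on every factor, so its running time is far above the stated O(n log σ) and O(n) bounds and it is only suitable for checking small examples. The function names (`count_occurrences`, `is_closed`, `distinct_closed_factors`) are hypothetical and chosen for illustration.

```python
def count_occurrences(text, pattern):
    """Count occurrences of pattern in text, including overlapping ones."""
    last_start = len(text) - len(pattern)
    return sum(text.startswith(pattern, i) for i in range(last_start + 1))


def is_closed(s):
    """Check the definition of a closed string directly (naive, illustrative only)."""
    if len(s) == 1:
        return True
    # Try every non-empty proper prefix as a candidate border.
    for k in range(1, len(s)):
        border = s[:k]
        # The border must also be a suffix and must occur nowhere else,
        # i.e. exactly twice in total (once as prefix, once as suffix).
        if s.endswith(border) and count_occurrences(s, border) == 2:
            return True
    return False


def distinct_closed_factors(t):
    """Return the set of distinct closed factors of t by exhaustive search."""
    factors = {t[i:j] for i in range(len(t)) for j in range(i + 1, len(t) + 1)}
    return {f for f in factors if is_closed(f)}
```

For instance, `is_closed("abab")` holds because the border `ab` occurs only as a prefix and as a suffix, whereas `abaa` is not closed because its only border `a` also occurs in the middle.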
