Linear-space LCS enumeration with quadratic-time delay for two strings
Yoshifumi Sakai
TL;DR
This work addresses the linear-space enumeration of all distinct LCSs of two strings with a Hirschberg-based, space-efficient framework that outputs LCS-position sequences in lexicographic order. The core idea combines a variant of Hirschberg’s LCS finder (firstLCS) with a branching routine (findBranch) to traverse the set of LCSs without constructing the full all-LCS DAG, taking $O(n^2)$ time per LCS and $O(L)$ space, where $n$ is the length of each string and $L$ is the LCS length. By enforcing a specified midpoint strategy and using leftmost representatives of LCS-position sequences, the algorithm attains quadratic-time delay in linear space, improving on the previous $O(n^2 \log L)$ delay bound and making enumeration feasible on long strings. The approach is relevant to pattern discovery and comparative sequence analysis where memory is the limiting resource.
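For orientation, the textbook form of Hirschberg's divide-and-conquer idea can be sketched as follows. This is not the paper's firstLCS variant: the function names `lcs_lengths` and `hirschberg_lcs` are hypothetical, the sketch returns a single LCS string rather than a position sequence, and it keeps DP rows and string slices proportional to the input length instead of meeting the $O(L)$ bound.

```python
def lcs_lengths(a: str, b: str) -> list[int]:
    """Return the last DP row: entry j is the LCS length of a and b[:j]."""
    prev = [0] * (len(b) + 1)
    for ch in a:
        cur = [0]
        for j, bj in enumerate(b):
            if ch == bj:
                cur.append(prev[j] + 1)
            else:
                cur.append(max(prev[j + 1], cur[j]))
        prev = cur  # only two rows are ever alive
    return prev


def hirschberg_lcs(a: str, b: str) -> str:
    """Return one LCS of a and b without storing the full DP table."""
    if not a or not b:
        return ""
    if len(a) == 1:
        return a if a in b else ""
    mid = len(a) // 2
    # left[j]  = LCS length of a[:mid] and b[:j]
    left = lcs_lengths(a[:mid], b)
    # right[k] = LCS length of a[mid:] and the last k characters of b
    right = lcs_lengths(a[mid:][::-1], b[::-1])
    # Pick the split of b that maximizes the combined LCS length.
    split = max(range(len(b) + 1), key=lambda j: left[j] + right[len(b) - j])
    return hirschberg_lcs(a[:mid], b[:split]) + hirschberg_lcs(a[mid:], b[split:])
```

Splitting the first string at its midpoint and solving the two halves independently is what removes the need for the quadratic table; the total work remains quadratic because the subproblem sizes shrink geometrically.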
Abstract
Suppose we want to find the longest common subsequences (LCSs) of two strings as informative patterns that explain the relationship between the strings. The dynamic programming algorithm gives us a table from which all LCSs can be extracted by traceback. However, the quadratic space needed to hold this table can be an obstacle when dealing with long strings. A natural question in this situation is whether all LCSs can be enumerated one by one in a time-efficient manner using only space linear in the LCS length, where the read-only memory storing the strings is excluded from the space count. As a partial answer to this question, we propose an $O(L)$-space algorithm that outputs all distinct LCSs of the strings one by one, each in $O(n^2)$ time, where both strings have length $n$ and $L$ is their LCS length.
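The quadratic-space baseline mentioned above can be illustrated with a minimal sketch. It is not the proposed algorithm: it stores the full DP table, and the names `lcs_table` and `enumerate_lcs` are hypothetical. Each distinct LCS is produced once, in lexicographic order, by always extending with the leftmost occurrence of the next character in both strings.

```python
from typing import Iterator


def lcs_table(a: str, b: str) -> list[list[int]]:
    """dp[i][j] is the LCS length of the suffixes a[i:] and b[j:]."""
    n, m = len(a), len(b)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    return dp


def enumerate_lcs(a: str, b: str) -> Iterator[str]:
    """Yield every distinct LCS of a and b once, in lexicographic order."""
    dp = lcs_table(a, b)  # quadratic space: the obstacle for long strings
    alphabet = sorted(set(a) & set(b))

    def extend(i: int, j: int) -> Iterator[str]:
        if dp[i][j] == 0:
            yield ""
            return
        for c in alphabet:
            # Leftmost occurrences of c at or after positions i and j.
            ia, jb = a.find(c, i), b.find(c, j)
            if ia < 0 or jb < 0:
                continue
            # c can start an LCS of a[i:], b[j:] only if the rest is optimal.
            if dp[ia + 1][jb + 1] == dp[i][j] - 1:
                for rest in extend(ia + 1, jb + 1):
                    yield c + rest

    yield from extend(0, 0)
```

For example, `list(enumerate_lcs("ABCBDAB", "BDCABA"))` gives `['BCAB', 'BCBA', 'BDAB']`. The table alone takes $(n+1)^2$ entries for two strings of length $n$, which is the cost a linear-space method avoids.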
