Cartesian Forest Matching
Bastien Auvray, Julien David, Richard Groult, Thierry Lecroq
TL;DR
This work generalizes Cartesian Trees to Cartesian Forests in order to handle equal values (ties) in sequence pattern matching. It shows that exact and approximate Cartesian Tree Matching techniques can be ported to Cartesian Forest Matching. For exact matching, the resulting algorithm uses $O(n)$ space and runs in $O(mn)$ time in the worst case and $O(n)$ time on average; approximate matching with one difference, one swap, or a single edit is supported with comparable complexity. The authors develop forest analogues of the Cartesian Tree representations (Forest Parent-Distance and Forest Skipped-Number) and prove one-to-one correspondences between Cartesian Forests, Schröder Trees, and Parentheses Words, supported by a generating-function analysis that yields the Schröder-Hipparchus numbers. A signature and a $\tau$-Filter accelerate matching in a Rabin-Karp framework, and experiments show practical improvements across various entropy regimes. Overall, the paper provides a theory and a practical framework for pattern matching on sequences with ties, connecting combinatorics and algorithms through Cartesian Forests and their links to classical structures.
Abstract
In this paper, we introduce the notion of Cartesian Forest, which generalizes Cartesian Trees, in order to deal with partially ordered sequences. We show that algorithms that solve both exact and approximate Cartesian Tree Matching can be adapted to solve Cartesian Forest Matching in average linear time. We adapt the notion of Cartesian Tree Signature to Cartesian Forests and show how filters can be used to experimentally improve the exact matching algorithm. We also show a one-to-one correspondence between Cartesian Forests and Schröder Trees.
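As background for the tree case that the paper generalizes, the following is a minimal sketch of the classic parent-distance representation used in Cartesian Tree Matching, together with a naive window-by-window matcher. It is not the paper's Forest Parent-Distance, whose definition is not reproduced here. The function names `parent_distance` and `naive_ct_match` are hypothetical, and the tie-breaking rule (the leftmost of two equal values is treated as the smaller one) is the usual tree convention, assumed here for illustration.

```python
def parent_distance(seq):
    """Classic parent-distance representation of a sequence.

    pd[i] is the distance from i to the nearest position j < i with
    seq[j] <= seq[i], or 0 if no such position exists. Ties are broken
    in favour of the leftmost value (an assumed, conventional choice).
    """
    pd = []
    stack = []  # indices whose values are non-decreasing from bottom to top
    for i, value in enumerate(seq):
        # Discard candidates that are strictly larger than the current value.
        while stack and seq[stack[-1]] > value:
            stack.pop()
        pd.append(i - stack[-1] if stack else 0)
        stack.append(i)
    return pd


def naive_ct_match(text, pattern):
    """Return start positions of windows of `text` whose parent-distance
    representation equals that of `pattern`.

    Each window is recomputed from scratch, so this takes O(mn) time
    for a text of length n and a pattern of length m.
    """
    m = len(pattern)
    target = parent_distance(pattern)
    return [
        start
        for start in range(len(text) - m + 1)
        if parent_distance(text[start:start + m]) == target
    ]
```

Under this convention `parent_distance([1, 1])` and `parent_distance([1, 2])` are both `[0, 1]`, so a tie is indistinguishable from a strict increase. This is the kind of ambiguity on equal values that motivates moving from trees to forests.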
