Searching 2D-Strings for Matching Frames

Itai Boneh; Dvir Fried; Shay Golan; Matan Kraus; Adrian Miclaus; Arseny Shur

TL;DR

The paper defines and studies matching frames in 2D strings, aiming to maximize the frame perimeter. It develops a heavy-light approach that splits frames into short and tall regimes, supported by the Segment Compatibility Data Structure and a suite of suffix-array, LCP, and orthogonal-range primitives to enable efficient frame detection. The authors achieve a spectrum of results: an exact $\tilde{O}(n^{2.5})$ time algorithm for square matrices and $\tilde{O}(ab\min\{a,\sqrt{b}\})$ in the general case, a near-linear $(1-\varepsilon)$-approximation in $\tilde{O}(nm/\varepsilon^{4})$, and a decision variant in $\tilde{O}(nm)$. These contributions introduce novel data-structural techniques and structural insights that advance 2D string pattern searching and have potential implications for tiling and 2D repetition problems.
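As a quick consistency check on the two exact bounds quoted above (a worked specialization, not a derivation from the paper), take $a=\min\{n,m\}$ and $b=\max\{n,m\}$ as in Theorem 1. Setting $n=m$ in the general bound recovers the square-matrix bound, and when one side is very short ($a\le\sqrt{b}$) the same expression reduces to $a^{2}b$:

$$
a=b=n:\quad ab\min\{a,\sqrt{b}\} = n\cdot n\cdot\min\{n,\sqrt{n}\} = n^{2}\cdot n^{1/2} = n^{2.5},
\qquad
a\le\sqrt{b}:\quad ab\min\{a,\sqrt{b}\} = a^{2}b .
$$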

Abstract

We introduce the natural notion of a matching frame in a $2$-dimensional string. A matching frame in a $2$-dimensional $n\times m$ string $M$ is a rectangle such that the strings written on the horizontal sides of the rectangle are identical, and so are the strings written on the vertical sides of the rectangle. Formally, a matching frame in $M$ is a tuple $(u,d,\ell,r)$ such that $M[u][\ell ..r] = M[d][\ell ..r]$ and $M[u..d][\ell] = M[u..d][r]$. In this paper, we present an algorithm for finding the maximum perimeter matching frame in a matrix $M$ in $\tilde{O}(n^{2.5})$ time (assuming $n \ge m$). Additionally, for every constant $\varepsilon > 0$ we present a near-linear $(1-\varepsilon)$-approximation algorithm for the maximum perimeter of a matching frame. In the development of the aforementioned algorithms, we introduce inventive technical elements and uncover distinctive structural properties that we believe will captivate the curiosity of the community.
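To make the formal definition concrete, here is a minimal Python sketch of a frame checker together with a naive exhaustive search. This is only a brute-force baseline for illustration, not the paper's algorithm: it runs in roughly $O(n^2 m^2 (n+m))$ time. The function names, the 0-indexed inclusive coordinates, and the restriction to non-degenerate frames ($u<d$, $\ell<r$) are assumptions made for this example; the perimeter formula $2\cdot((d-u)+(r-\ell))$ follows the Figure 1 caption.

```python
from typing import Optional, Sequence, Tuple

# (u, d, l, r): top row, bottom row, left column, right column.
# Assumption for this sketch: 0-indexed and inclusive.
Frame = Tuple[int, int, int, int]


def is_matching_frame(M: Sequence[str], u: int, d: int, l: int, r: int) -> bool:
    """Check M[u][l..r] == M[d][l..r] and M[u..d][l] == M[u..d][r]."""
    # Horizontal sides: compare the two row segments directly.
    if M[u][l:r + 1] != M[d][l:r + 1]:
        return False
    # Vertical sides: compare the two columns cell by cell.
    return all(M[i][l] == M[i][r] for i in range(u, d + 1))


def perimeter(u: int, d: int, l: int, r: int) -> int:
    """Perimeter as computed in the Figure 1 caption: 2 * ((d - u) + (r - l))."""
    return 2 * ((d - u) + (r - l))


def max_perimeter_frame_naive(M: Sequence[str]) -> Optional[Frame]:
    """Try every non-degenerate frame (u < d, l < r); return a best one or None."""
    n = len(M)
    m = len(M[0]) if n else 0
    best: Optional[Frame] = None
    best_p = -1
    for u in range(n):
        for d in range(u + 1, n):
            for l in range(m):
                for r in range(l + 1, m):
                    p = perimeter(u, d, l, r)
                    # Only run the (linear-time) check if it could improve the answer.
                    if p > best_p and is_matching_frame(M, u, d, l, r):
                        best, best_p = (u, d, l, r), p
    return best


if __name__ == "__main__":
    M = ["abca",
         "xyzx",
         "abca"]
    print(max_perimeter_frame_naive(M))
```

On the toy matrix in the `__main__` block, rows 0 and 2 are equal and columns 0 and 3 are equal, so the whole matrix boundary is a matching frame. The exact and approximate algorithms summarized above exist precisely to avoid this kind of exhaustive enumeration.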

Paper Structure (15 sections, 19 theorems, 4 figures)

This paper contains 15 sections, 19 theorems, 4 figures.

Introduction
High-Level Overview
Preliminaries
Suffix Arrays, Longest Common Prefixes
Orthogonal Range Queries
Data Structures
The Segment Compatibility Data Structure
Maximum Matching Frame
Algorithm for Short Frames
Algorithm for Tall Frames
Combining the Short and Tall Algorithms
Approximation Version
Interesting Pairs and Interesting Triplets
Finding all interesting triplets
Algorithm for the Decision Variant

Key Result

Theorem 1

The time complexity of the maximum matching frame problem for an $n\times m$ matrix $M$ is $\tilde{O}(n^{2.5})$ in the case $m=\Theta(n)$. In the general case, the complexity is $\tilde{O}(ab\min\{a,\sqrt{b}\})$, where $a=\min\{n,m\}$ and $b=\max\{n,m\}$. Throughout the paper, $\tilde{O}(f(n)) = O(f(

Figures (4)

Figure 1: An example of a matching frame $(u,d,\ell,r)=(2,6,3,9)$. The strings on the top and bottom sides of the frame are equal, and the strings on the left and right sides are also equal. The perimeter of the frame is $2\cdot(6-2+9-3)=20$. The matrix also contains a smaller matching frame.
Figure 2: (a) An example of $\mathsf{LSA}^\ell_{\mathsf{rows}}$. Every cell in $\mathsf{LSA}^\ell_\mathsf{rows}$ contains an index corresponding to a horizontal word in the matrix starting in column $\ell$. The (indices representing the) words appear bottom-up in lex-order. (b) A visualization of the points stored in $D^\ell_\mathsf{rows}$. Every point corresponds to a horizontal word. The height of every point corresponds to the location of the corresponding word in $\mathsf{LSA}^\ell_\mathsf{rows}$. The horizontal location of a point represents the index of its appearance in the string.
Figure 3: An example of horizontal and vertical pairs of aligned segments. Every pair of monochromatic lines is an aligned pair of segments. The red, green, blue, and purple pairs are horizontal. The yellow pair of vertical segments is compatible with the red and with the green pair.
Figure 4: An example of interesting pairs where the first component of the pair is $S_1$ or $S_4$. The rows beginning in red form interesting pairs with $S_1$ and the rows beginning in blue form interesting pairs with $S_4$. The color indicates the $\mathsf{LCP}$ of the components of the pair. Notice that $(S_1,S_8)$ is not an interesting pair because of $S_6$.

Theorems & Definitions (23)

Theorem 1: Maximum Matching Frame
Theorem 2: $(1-\varepsilon)$-Approximation
Corollary 3: Deciding Matching Frame
Lemma 4
Lemma 5
Definition 8: Fingerprint
Lemma 9
Lemma 10
Lemma 11
Lemma 12
...and 13 more
