Searching 2D-Strings for Matching Frames

Itai Boneh, Dvir Fried, Shay Golan, Matan Kraus, Adrian Miclaus, Arseny Shur

TL;DR

The paper defines matching frames in 2D strings and studies the problem of finding one with maximum perimeter. It develops a heavy-light approach that splits frames into short and tall regimes, supported by the Segment Compatibility Data Structure and a suite of suffix-array, LCP, and orthogonal-range primitives for efficient frame detection. The authors obtain a spectrum of results: an exact $\tilde{O}(n^{2.5})$-time algorithm for an $n\times m$ matrix with $m=\Theta(n)$, and $\tilde{O}(ab\min\{a,\sqrt{b}\})$ time in the general case, where $a=\min\{n,m\}$ and $b=\max\{n,m\}$; a near-linear $(1-\varepsilon)$-approximation in $\tilde{O}(nm/\varepsilon^{4})$ time; and a decision variant in $\tilde{O}(nm)$ time. These contributions introduce new data-structural techniques and structural insights for 2D string pattern searching, with potential implications for tiling and 2D repetition problems.

Abstract

We introduce the natural notion of a matching frame in a $2$-dimensional string. A matching frame in a $2$-dimensional $n\times m$ string $M$ is a rectangle such that the strings written on the horizontal sides of the rectangle are identical, and so are the strings written on the vertical sides of the rectangle. Formally, a matching frame in $M$ is a tuple $(u,d,\ell,r)$ such that $M[u][\ell ..r] = M[d][\ell ..r]$ and $M[u..d][\ell] = M[u..d][r]$. In this paper, we present an algorithm for finding the maximum perimeter matching frame in a matrix $M$ in $\tilde{O}(n^{2.5})$ time (assuming $n \ge m$). Additionally, for every constant $\varepsilon > 0$ we present a near-linear $(1-\varepsilon)$-approximation algorithm for the maximum perimeter of a matching frame. In the development of the aforementioned algorithms, we introduce inventive technical elements and uncover distinctive structural properties that we believe will captivate the curiosity of the community.
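To make the formal definition concrete, here is a minimal brute-force sketch in Python. It is not the paper's algorithm: it checks the two equalities from the definition directly and tries every candidate tuple, which is far slower than the bounds stated above. The function names, the 0-based indexing, and the restriction to non-degenerate frames ($u<d$ and $\ell<r$) are assumptions made for illustration. The perimeter is computed as $2\cdot((d-u)+(r-\ell))$, following the convention in the caption of Figure 1.

```python
from typing import Optional, Sequence, Tuple

Frame = Tuple[int, int, int, int]  # (u, d, l, r), 0-based and inclusive


def is_matching_frame(M: Sequence[Sequence], u: int, d: int, l: int, r: int) -> bool:
    """Check M[u][l..r] == M[d][l..r] and M[u..d][l] == M[u..d][r]."""
    # Horizontal sides: compare the two row fragments as slices.
    if M[u][l:r + 1] != M[d][l:r + 1]:
        return False
    # Vertical sides: compare the two column fragments cell by cell.
    return all(M[i][l] == M[i][r] for i in range(u, d + 1))


def perimeter(u: int, d: int, l: int, r: int) -> int:
    """Perimeter convention taken from the Figure 1 caption."""
    return 2 * ((d - u) + (r - l))


def max_matching_frame_bruteforce(M: Sequence[Sequence]) -> Optional[Frame]:
    """Try every tuple with u < d and l < r (an assumption of this sketch)."""
    n = len(M)
    m = len(M[0]) if n else 0
    best: Optional[Frame] = None
    for u in range(n):
        for d in range(u + 1, n):
            for l in range(m):
                for r in range(l + 1, m):
                    if not is_matching_frame(M, u, d, l, r):
                        continue
                    if best is None or perimeter(u, d, l, r) > perimeter(*best):
                        best = (u, d, l, r)
    return best


if __name__ == "__main__":
    # Rows are given as strings; any sequence of equal-length sequences works.
    example = ["abca", "xyzx", "abca"]
    print(max_matching_frame_bruteforce(example))
```

A checker like this is mainly useful as a reference oracle when testing a faster implementation on small inputs.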

Paper Structure (15 sections, 19 theorems, 4 figures)


Key Result

Theorem 1

The time complexity of the maximum matching frame problem for an $n\times m$ matrix $M$ is $\tilde{O}(n^{2.5})$ in the case $m=\Theta(n)$. In the general case, the complexity is $\tilde{O}(ab\min\{a,\sqrt{b}\})$, where $a=\min\{n,m\}$ and $b=\max\{n,m\}$. Throughout the paper, $\tilde{O}(f(n)) = O(f(

Figures (4)

  • Figure 1: An example of a matching frame $(u,d,\ell,r)=(2,6,3,9)$. The strings on the top and bottom sides of the frame are equal, and the strings on the left and right sides are also equal. The perimeter of the frame is $2\cdot(6-2+9-3)=20$. The matrix also contains a smaller matching frame.
  • Figure 2: (a) An example of $\mathsf{LSA}^\ell_{\mathsf{rows}}$. Every cell in $\mathsf{LSA}^\ell_\mathsf{rows}$ contains an index corresponding to a horizontal word in the matrix starting in column $\ell$. The (indices representing the) words appear bottom-up in lex-order. (b) A visualization of the points stored in $D^\ell_\mathsf{rows}$. Every point corresponds to a horizontal word. The height of every point corresponds to the location of the corresponding word in $\mathsf{LSA}^\ell_\mathsf{rows}$. The horizontal location of a point represents the index of its appearance in the string.
  • Figure 3: An example of horizontal and vertical pairs of aligned segments. Every pair of monochromatic lines is an aligned pair of segments. The red, green, blue, and purple pairs are horizontal. The yellow pair of vertical segments is compatible with the red and with the green pair.
  • Figure 4: An example of interesting pairs where the first component of the pair is $S_1$ or $S_4$. The rows beginning in red form interesting pairs with $S_1$ and the rows beginning in blue form interesting pairs with $S_4$. The color indicates the $\mathsf{LCP}$ of the components of the pair. Notice that $(S_1,S_8)$ is not an interesting pair because of $S_6$.
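The captions of Figures 2 and 4 refer to a lexicographically sorted array of horizontal words and to $\mathsf{LCP}$ values between rows. The following naive Python sketch illustrates both notions under stated assumptions: a "horizontal word starting in column $\ell$" is taken to be the suffix of a row from column $\ell$ to the end of the row, ties are broken by row index, and all function names are hypothetical. The paper's actual construction and its running time may differ; this sketch simply sorts and scans.

```python
from typing import List, Sequence


def lsa_rows(M: Sequence[Sequence], l: int) -> List[int]:
    """Row indices ordered by the lexicographic order of the words M[i][l:].

    Assumption: the word of row i is its suffix starting at column l.
    Python's sort is stable, so equal words keep increasing row order.
    """
    return sorted(range(len(M)), key=lambda i: M[i][l:])


def lcp(x: Sequence, y: Sequence) -> int:
    """Length of the longest common prefix of x and y."""
    k = 0
    while k < len(x) and k < len(y) and x[k] == y[k]:
        k += 1
    return k


def max_right_end(M: Sequence[Sequence], u: int, d: int, l: int) -> int:
    """Largest r with M[u][l..r] == M[d][l..r], or l - 1 if M[u][l] != M[d][l]."""
    return l + lcp(M[u][l:], M[d][l:]) - 1


if __name__ == "__main__":
    rows = ["banana", "bandit", "apple!", "banana"]
    print(lsa_rows(rows, 0))
    print(max_right_end(rows, 0, 1, 0))
```

The connection to matching frames is that the horizontal condition $M[u][\ell..r] = M[d][\ell..r]$ holds exactly when the $\mathsf{LCP}$ of the two words starting in column $\ell$ is at least $r-\ell+1$, so `max_right_end` bounds how far to the right a frame with top row $u$, bottom row $d$, and left column $\ell$ can extend.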

Theorems & Definitions (23)

  • Theorem 1: Maximum Matching Frame
  • Theorem 2: $(1-\varepsilon)$-Approximation
  • Corollary 3: Deciding Matching Frame
  • Lemma 4
  • Lemma 5
  • Definition 8: Fingerprint
  • Lemma 9
  • Lemma 10
  • Lemma 11
  • Lemma 12
  • ...and 13 more