Edit and Alphabet-Ordering Sensitivity of Lex-parse

Yuto Nakashima, Dominik Köppl, Mitsuru Funakoshi, Shunsuke Inenaga, Hideo Bannai

TL;DR

This work analyzes the sensitivity of lex-parse to two perturbations: single-character edits and alphabet-ordering changes. It develops tight upper and lower bounds by leveraging Fibonacci words and Lyndon factorizations, establishing that both edit-sensitivity and alphabet-ordering sensitivity of lex-parse scale as $\Theta(\log n)$. The results connect lex-parse behavior to bidirectional macro schemes and deepen understanding of dictionary compressors and repetitiveness measures. Overall, the findings reveal fundamental limits on how input modifications and alphabet ordering influence lex-parse structure and size.
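To make the two perturbations concrete, the following is a minimal, naive sketch of lex-parse. It assumes the usual definition, in which the phrase starting at position $i$ has length $\max(1, \ell)$ and $\ell$ is the longest common prefix between the suffix at $i$ and its lexicographic predecessor among all suffixes. The function names, the `order` parameter, and the Fibonacci-word convention ($F_1 = b$, $F_2 = a$, $F_k = F_{k-1}F_{k-2}$) are illustrative assumptions, not notation taken from the paper.

```python
def fibonacci_word(k):
    """k-th Fibonacci word under the assumed convention
    F_1 = "b", F_2 = "a", F_k = F_{k-1} + F_{k-2}."""
    prev, cur = "b", "a"
    if k == 1:
        return prev
    for _ in range(k - 2):
        prev, cur = cur, cur + prev
    return cur


def lex_parse(text, order=None):
    """Naive lex-parse sketch (quadratic or worse; for small inputs only).

    `order` is an optional string listing the alphabet from smallest to
    largest character; by default the built-in character order is used.
    """
    n = len(text)
    if order is None:
        key = lambda i: text[i:]
    else:
        rank_of = {c: r for r, c in enumerate(order)}
        key = lambda i: [rank_of[c] for c in text[i:]]
    sa = sorted(range(n), key=key)             # suffix array (naive)
    isa = {pos: r for r, pos in enumerate(sa)}  # inverse suffix array
    phrases, i = [], 0
    while i < n:
        r, lcp = isa[i], 0
        if r > 0:
            j = sa[r - 1]  # start of the lexicographically preceding suffix
            while i + lcp < n and j + lcp < n and text[i + lcp] == text[j + lcp]:
                lcp += 1
        length = max(1, lcp)  # lcp == 0 yields an explicit character
        phrases.append(text[i:i + length])
        i += length
    return phrases


if __name__ == "__main__":
    for k in range(5, 13):
        w = fibonacci_word(k)
        flipped = "b" if w[-1] == "a" else "a"
        edited = w[:-1] + flipped               # one substitution at the end
        print(k, len(w),
              len(lex_parse(w)),                # original order a < b
              len(lex_parse(edited)),           # after a single-character edit
              len(lex_parse(w, order="ba")))    # after swapping the order
```

The loop only prints phrase counts for small $k$; it does not reproduce the extremal edits analyzed in the paper, which depend on where the edit is placed.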

Abstract

We investigate the compression sensitivity [Akagi et al., 2023] of lex-parse [Navarro et al., 2021] for two operations: (1) single character edit and (2) modification of the alphabet ordering, and give tight upper and lower bounds for both operations. For both lower bounds, we use the family of Fibonacci words. For the bounds on edit operations, our analysis makes heavy use of properties of the Lyndon factorization of Fibonacci words to characterize the structure of lex-parse.
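Since the edit bounds rely on the Lyndon factorization of Fibonacci words, a small sketch of how such a factorization can be computed may help. The code below uses Duval's standard linear-time algorithm, which is a well-known technique and not necessarily the tool used in the paper's proofs; the function names and the convention $F_1 = b$, $F_2 = a$, $F_k = F_{k-1}F_{k-2}$ are assumptions made for illustration.

```python
def fibonacci_word(k):
    """k-th Fibonacci word under the assumed convention
    F_1 = "b", F_2 = "a", F_k = F_{k-1} + F_{k-2}."""
    prev, cur = "b", "a"
    if k == 1:
        return prev
    for _ in range(k - 2):
        prev, cur = cur, cur + prev
    return cur


def lyndon_factorization(s):
    """Duval's algorithm: split s into a lexicographically non-increasing
    sequence of Lyndon words."""
    n, i, factors = len(s), 0, []
    while i < n:
        j, k = i + 1, i
        # Extend the current run while it remains a prefix of a power
        # of a Lyndon word.
        while j < n and s[k] <= s[j]:
            k = i if s[k] < s[j] else k + 1
            j += 1
        # Emit every complete copy of the Lyndon word of length j - k.
        while i <= k:
            factors.append(s[i:i + j - k])
            i += j - k
    return factors


if __name__ == "__main__":
    for k in range(3, 9):
        print(k, lyndon_factorization(fibonacci_word(k)))
```

Running the loop for increasing $k$ gives a quick way to inspect the regular structure of these factorizations on small inputs.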

Paper Structure (7 sections, 24 theorems, 22 equations, 4 figures)

This paper contains 7 sections, 24 theorems, 22 equations, and 4 figures.

Key Result

Lemma 1

$w$ is primitive iff $w$ occurs exactly twice in $w^2$.
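The characterization in Lemma 1 translates directly into a test: count the (possibly overlapping) occurrences of $w$ in $w^2$ and check that there are exactly two. The sketch below does this and compares the result against a brute-force definition of primitivity; both function names are illustrative, and $w$ is assumed to be non-empty.

```python
def count_occurrences(pattern, text):
    """Count possibly overlapping occurrences of pattern in text."""
    count, start = 0, text.find(pattern)
    while start != -1:
        count += 1
        start = text.find(pattern, start + 1)
    return count


def is_primitive(w):
    """Lemma 1 as a test: w is primitive iff it occurs exactly twice in w + w."""
    if not w:
        raise ValueError("w is assumed to be non-empty")
    return count_occurrences(w, w + w) == 2


def is_primitive_bruteforce(w):
    """Definition: w is primitive if it is not u repeated t >= 2 times."""
    n = len(w)
    return not any(n % d == 0 and w[:d] * (n // d) == w for d in range(1, n))


if __name__ == "__main__":
    for w in ["a", "ab", "aba", "abab", "aaa", "abaab"]:
        print(w, is_primitive(w), is_primitive_bruteforce(w))
```

For example, `abab` occurs three times in `abababab` (at offsets 0, 2, and 4), so it is not primitive, whereas `aba` occurs only at offsets 0 and 3 in `abaaba`.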

Figures (4)

  • Figure 1: Illustration of the characterization of the edited string $T_{2k}$ by the Lyndon factorization of $F"_{2k}$ (when $k=5$).
  • Figure 2: Illustration of $T_{2k}$ for the proof of Lemma \ref{lem:maxsuf}.
  • Figure 3: Illustration of the proof of Lemma \ref{lemFourFactorsEvenN} for $k$ even. If $k$ is odd, the blocks $ab$ and $ba$ are swapped (this gives the setting of Lemma \ref{lemFourFactorsOddN}).
  • Figure 4: Illustration of the proof of Lemma \ref{lem:ao-critical-suffix}.

Theorems & Definitions (38)

  • Lemma 1 [Lothaire, 2005]
  • Lemma 2: Useful properties of Fibonacci words (cf. [Navarro et al., 2021])
  • Lemma 3
  • proof
  • Theorem 1
  • proof
  • Theorem 2
  • Theorem 3
  • Lemma 4 [Melançon, 2000]
  • Lemma 5
  • ...and 28 more