Characterization of Isometric Words based on Swap and Mismatch Distance

M. Anselmo, G. Castiglione, M. Flores, D. Giammarresi, M. Madonia, S. Mantaci

TL;DR

This work extends the classical notion of isometric words from the Hamming distance to a swap-and-mismatch edit distance, termed tilde-distance ${\rm dist}_{\sim}$. It defines tilde-isometric words and introduces tilde-witnesses to certify non-isometricity, showing that tilde-isometricity depends on how a word overlaps with itself, that is, on how its prefixes match its suffixes up to a small number of tilde-errors. The central result is a complete characterization: a word $f$ is tilde-non-isometric if and only if it exhibits specific 1-tilde-error or 2-tilde-error overlap patterns (C0–C5), up to symmetry operations. The proofs combine structural lemmas on tilde-witnesses with constructive techniques to establish both necessity and sufficiency, laying the groundwork for the study of tilde-hypercubes and generalized tilde-Fibonacci cubes. These findings deepen the understanding of string-edit dynamics with swaps and offer potential algorithmic tools for tilde-distance-related problems in combinatorics on words and graph-based string models.

Abstract

In this paper we consider an edit distance with swap and mismatch operations, called tilde-distance, and introduce the corresponding definition of tilde-isometric word. Isometric words are classically defined with respect to the Hamming distance and combine the notion of edit distance with the property that a word does not appear as a factor in other words. A word f is said to be tilde-isometric if, for any pair of f-free words u and v, there exists a transformation from u to v via the related edit operations such that all the intermediate words are also f-free. We study this new setting and give a full characterization of the tilde-isometric words in terms of overlaps with errors.
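To make the swap-and-mismatch distance concrete, here is a minimal Python sketch. It rests on assumptions that the abstract does not spell out: words are binary strings of equal length, a replacement flips one symbol, a swap exchanges two adjacent different symbols, and an optimal transformation touches each position at most once (the usual convention for swap-and-mismatch distances). The function name `tilde_distance` is hypothetical and is not taken from the paper.

```python
def tilde_distance(u: str, v: str) -> int:
    """Minimum number of replacements and adjacent swaps turning u into v.

    Assumption: each position takes part in at most one operation, so a
    left-to-right dynamic program over prefixes is enough.
    """
    if len(u) != len(v):
        raise ValueError("words must have the same length")
    n = len(u)
    # dp[i] = cost of transforming u[:i] into v[:i]
    dp = [0] * (n + 1)
    for i in range(1, n + 1):
        # Option 1: handle position i-1 alone (replacement if it mismatches).
        dp[i] = dp[i - 1] + (u[i - 1] != v[i - 1])
        # Option 2: one swap fixes positions i-2 and i-1 together.
        if (
            i >= 2
            and u[i - 2] != u[i - 1]
            and u[i - 2] == v[i - 1]
            and u[i - 1] == v[i - 2]
        ):
            dp[i] = min(dp[i], dp[i - 2] + 1)
    return dp[n]


if __name__ == "__main__":
    print(tilde_distance("1011", "0110"))  # one swap plus one replacement
```

Under these assumptions a swap pays off only when it repairs two adjacent mismatches at once, so the tilde-distance never exceeds the Hamming distance.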

Paper Structure (13 sections, 12 theorems, 11 equations, 3 figures)

Key Result

Proposition 1

A word $f$ is not Ham-isometric if and only if $f$ has a 2-error overlap.
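The criterion in Proposition 1 can be checked mechanically. The sketch below assumes the usual definition from the literature on Hamming-isometric words, which this page does not restate: $f$ of length $n$ has a 2-error overlap of length $\ell$ (shift $r = n - \ell$) when its prefix and suffix of length $\ell$ differ in exactly two positions. The helper names are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions where two equal-length words differ."""
    return sum(x != y for x, y in zip(a, b))


def two_error_overlaps(f: str) -> list[int]:
    """Lengths l (0 < l < len(f)) whose prefix/suffix pair differs in exactly 2 positions."""
    n = len(f)
    return [l for l in range(1, n) if hamming(f[:l], f[n - l:]) == 2]


def has_two_error_overlap(f: str) -> bool:
    """Under the assumed definition, True means f is not Ham-isometric by Proposition 1."""
    return bool(two_error_overlaps(f))


if __name__ == "__main__":
    for word in ["11", "110", "101", "1100"]:
        print(word, two_error_overlaps(word))
```

For example, `1100` has prefix `11` and suffix `00` of length 2, which differ in two positions, so the check reports a 2-error overlap of length 2. For `11` no such overlap exists.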

Figures (3)

  • Figure 1: A word $f$ and its $2$-tilde-error overlap of shift $r$ and length $\ell=n-r$, with tilde-transformation $(O_i, O_j)=(R_i,R_j)$ (left), and $(O_i, O_j)=(S_i,R_j)$ (right)
  • Figure 2: The representation of the three occurrences $f^1$, $f^2$, and $f^3$ of $f$ in $O^1(u)$, $O^2(u)$ and $O^3(u)$, respectively, when $u[s..s+2]=101$.
  • Figure 3: The representation of the three occurrences $f^1$, $f^2$, and $f^3$ of $f$ in $O^1(u)$, $O^2(u)$ and $O^3(u)$, respectively, when $u[s..s+2]=100$.

Theorems & Definitions (32)

  • Proposition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Example 5
  • Example 6
  • Remark 7
  • Definition 8
  • Definition 9
  • Example 10
  • ...and 22 more