Shuffle squares and ordered nest-free graphs
Jarosław Grytczuk, Bartłomiej Pawlik, Andrzej Ruciński
TL;DR
The paper introduces a graphic framework that represents shuffle squares of binary words as nest-free ordered graphs, via canonical twins and a constructive contraction to a graph $G_{X,Y}$. This approach yields precise criteria for when a word is a shuffle square, including complete classifications for words with up to five runs and for words with separated ones, and leads to a major result (the odd ABBA problem) showing that $(1001)^n$ with $n$ odd is not a shuffle square. A central contribution is a general deletion-distance framework, with a detailed construction showing that the word family $O_{m,r}$ has deficit at most 23, supporting broader conjectures on how close arbitrary words are to shuffle squares. The work culminates in open problems on doubly binary words, distance measures to shuffle squares (and reverse variants), and potential extensions to higher alphabets, suggesting the framework's broad applicability in combinatorics on words.
Abstract
A shuffle square is a word consisting of two shuffled copies of the same word. For instance, the Turkish word $\mathtt{\color{red}{ik}\color{blue}{i}\color{red}{li}\color{blue}{kli}}$ (binary in English) is a shuffle square, as it can be split into two copies of the word $\mathtt{ikli}$. We explore a representation of shuffle squares in terms of \emph{ordered nest-free graphs} and demonstrate the usefulness of this approach by applying it to several families of binary words. Among others, we characterize shuffle squares with four and five runs, as well as shuffle squares with all $\mathtt1$-runs of length one (and with the $\mathtt1$'s alternating between the two copies). In our main result we provide quite general sufficient conditions for a binary word not to be a shuffle square. In particular, it follows that binary words of the type $(\mathtt{1001})^n$, $n$ odd, are not shuffle squares. We complement it by showing that all other words whose every $\mathtt{1}$-run has length one or two, while every $\mathtt{0}$-run has length two, are shuffle squares. We also provide a counterexample to a believable stipulation that binary words of the form $\mathtt1^{m}\mathtt0^{m-2}\mathtt1^{m-4}\cdots$, $m$ odd, are far from being shuffle squares (the distance measured by the minimum number of letters one has to delete in order to turn a word into a shuffle square).
