In search of maximum non-overlapping codes
Lidija Stanovnik, Miha Moškon, Miha Mraz
TL;DR
This work tackles the problem of maximum non-overlapping (cross-bifix-free) codes by formulating SQN$(q,n)$, an integer optimization problem that selects partition sizes to maximize code size within the partition-based framework $\mathcal{M}_{q,n}$ proposed by Fimmel et al. It proves that all maximal non-overlapping codes reside in $\mathcal{M}_{q,n}$ and provides a concrete size formula, $|C|=\sum_{i=1}^{n-1} |L_i||R_{n-i}|$, along with necessary and sufficient maximality conditions. The authors derive exact results for codes with codewords of length four, e.g., $S(q,4)=\left[\frac{3q}{4}\right]^3 \left(q-\left[\frac{3q}{4}\right]\right)$ and $N(q,4)=2\binom{q}{\left[\frac{3q}{4}\right]} q$, and show through extensive computations that existing constructions are often not optimal. They also present reductions of SQN that prune the search space, compare outcomes with prior constructions, and offer conjectures about the structure of maximal codes, supported by public code and data.
Abstract
Non-overlapping codes are block codes that have arisen in diverse contexts of computer science and biology. Applications typically require finding non-overlapping codes with large cardinalities, but the maximum size of non-overlapping codes has been determined only for cases where the codeword length divides the size of the alphabet, and for codes with codewords of length two or three. For all other alphabet sizes and codeword lengths, no computationally feasible way to identify non-overlapping codes that attain the maximum size has been found to date. Herein we characterize maximal non-overlapping codes. We formulate the maximum non-overlapping code problem as an integer optimization problem and determine necessary conditions for optimality of a non-overlapping code. Moreover, we solve several instances of the optimization problem to show that the hitherto known constructions do not generate optimal codes for many alphabet sizes and codeword lengths. We also evaluate the number of distinct maximum non-overlapping codes.
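To make the defining property concrete, the following is a minimal Python sketch, not the authors' published code. It assumes the usual definition of a non-overlapping (cross-bifix-free) block code: no non-empty proper prefix of any codeword equals a suffix of any codeword, including the codeword itself. The helper names `is_non_overlapping`, `two_block_code`, and `best_two_block_size` are hypothetical, and the "two-block" construction (first letter from one part of the alphabet, all remaining letters from the other part) is only one simple family of such codes, chosen here for illustration.

```python
from itertools import product


def is_non_overlapping(code):
    """Return True if no non-empty proper prefix of any codeword equals a suffix of any codeword."""
    words = [tuple(w) for w in code]
    if not words:
        return True
    n = len(words[0])  # block code: all codewords are assumed to have the same length
    for k in range(1, n):  # proper, non-empty prefix/suffix lengths
        prefixes = {w[:k] for w in words}
        suffixes = {w[-k:] for w in words}
        if prefixes & suffixes:
            return False
    return True


def two_block_code(q, n, k):
    """Hypothetical helper: first letter from {k, ..., q-1}, remaining n-1 letters from {0, ..., k-1}."""
    head = range(k, q)
    tail = range(k)
    return [(a, *rest) for a in head for rest in product(tail, repeat=n - 1)]


def best_two_block_size(q, n):
    """Return (size, k) maximizing (q - k) * k**(n - 1) over all splits 1 <= k < q."""
    return max(((q - k) * k ** (n - 1), k) for k in range(1, q))


code = two_block_code(q=4, n=4, k=3)
print(len(code), is_non_overlapping(code))
print(best_two_block_size(4, 4))
```

For $q=4$ and $n=4$, the split $k=3$ yields 27 codewords, which is the value the quoted expression $\left[\frac{3q}{4}\right]^3\left(q-\left[\frac{3q}{4}\right]\right)$ takes at $q=4$. The sketch enumerates codewords explicitly, so it is only practical for small $q$ and $n$.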
