Lexicographic transductions of finite words
Emmanuel Filiot, Pierre-Alain Reynier, Nathan Lhote
TL;DR
This work defines Lex, a class of transductions over finite words with exponential growth, and gives three characterizations of it: as the closure of recognizable transductions under a single lexicographic operator called maplex; as a syntactic fragment of MSOSI in which regular languages are given by automata; and as an automaton model, nested marble transducers (NMT), which is shown to be equivalent to Lex. The authors prove that Lex subsumes both SST and polyregular transductions, is regularity preserving, and is closed under post-composition with PolyReg and pre-composition with rational transductions. They also analyze the broader MSOSI framework, where whether MSOSI are regularity preserving remains a key open question. NMT thus give a concrete automata-theoretic model for regular transductions with exponential growth, and the paper outlines open problems and possible extensions concerning MSOSI and exponential-growth transductions.
Abstract
Regular transductions over finite words have linear input-to-output growth. This class of transductions enjoys many characterizations. Recently, regular transductions have been extended by Bojańczyk to polyregular transductions, which have polynomial growth and are characterized by pebble transducers and MSO interpretations. Another class of interest is that of transductions defined by streaming string transducers or marble transducers, which have exponential growth and are incomparable with polyregular transductions. In this paper, we consider MSO set interpretations (MSOSI) over finite words, which were introduced by Colcombet and Loeding. MSOSI are a natural candidate for the class of "regular transductions with exponential growth", and are rather well-behaved. However, MSOSI lack, for now, two desirable properties that regular and polyregular transductions have. The first property is being described by an automaton model. It is closely related to the second property, being regularity preserving, that is, preserving regular languages under inverse image. We first show that if MSOSI are (effectively) regularity preserving, then any automatic $ω$-word has a decidable MSO theory, an almost 20-year-old conjecture of Bárány. Our main contribution is the introduction of a class of transductions of exponential growth, which we call lexicographic transductions. We provide three different presentations for this class: 1) as the closure of simple transductions (recognizable transductions) under a single operator called maplex; 2) as a syntactic fragment of MSOSI (but the regular languages are given by automata instead of formulas); 3) as an automaton-based model called nested marble transducers, which generalize both marble transducers and pebble transducers. We show that this class enjoys many nice properties, including being regularity preserving.
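The three growth regimes mentioned above can be illustrated with one toy transduction each. The following is a minimal sketch: the function names and the specific examples are illustrative assumptions, not taken from the paper. Reversal has linear growth, the concatenation of all prefixes has quadratic growth, and a single register updated by X := X X a on each input letter a (in the style of a streaming string transducer that may copy its register) has exponential growth.

```python
def reverse(w: str) -> str:
    # Linear growth: the output has exactly the length of the input.
    return w[::-1]


def prefixes(w: str) -> str:
    # Polynomial (quadratic) growth: concatenate all non-empty prefixes
    # of w, giving an output of length n(n+1)/2 for |w| = n.
    return "".join(w[:i] for i in range(1, len(w) + 1))


def doubling(w: str) -> str:
    # Exponential growth: one register x, updated by x := x x a on each
    # input letter a, so the output has length 2^n - 1 for |w| = n.
    x = ""
    for a in w:
        x = x + x + a
    return x


if __name__ == "__main__":
    for n in (1, 2, 4, 8):
        w = "a" * n
        print(n, len(reverse(w)), len(prefixes(w)), len(doubling(w)))
```

Running the script prints the output lengths for a few input lengths, which shows how quickly the third function outgrows the other two.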
