Languages of Words of Low Automatic Complexity Are Hard to Compute

Joey Chen; Bjørn Kjos-Hanssen; Ivan Koswara; Linus Richter; Frank Stephan

TL;DR

This work introduces and analyzes the low-complexity languages $L_q=\bigl\{ x: A_{Ne}(x)<q|x|\bigr\}$ for $q \in (0,1/2)$, where $A_{Ne}$ is the non-deterministic automatic complexity defined via NFAs that exactly accept a word. It proves that each $L_q$ is not context-free and not recognizable by several constant-depth circuit families (including $\textbf{SAC}^0$ and $\oplus\textbf{SAC}^0$ on prime alphabets), and it establishes the Shannon effect for $A_{Ne}$. The paper also answers a question of Kjos-Hanssen by showing $L_{1/3}$ is not $\oplus\textbf{SAC}^0$-recognisable, and provides a partial generalisation to non-prime alphabets. Together, these results reinforce the hardness of computing $A_{Ne}$ and illuminate the structural and circuit-theoretic limits of words with low non-deterministic automatic complexity, contributing to the broader understanding of automatic complexity as a computationally intractable measure.

Abstract

The automatic complexity of a finite word (string) is an analogue for finite automata of Sipser's distinguishing complexity (1983) and was introduced by Shallit and Wang (2001). For a finite alphabet $Σ$ of at least two elements, we consider the non-deterministic automatic complexity given by exactly - yet not necessarily uniquely - accepting automata: a word $x \in Σ^*$ has exact non-deterministic automatic complexity $k \in \mathbb{N}$ if there exists a non-deterministic automaton of $k$ states which accepts $x$ while rejecting every other word of the same length as $x$, and no automaton of fewer states has this property. Importantly, and in contrast to the classical notion, the witnessing automaton may have multiple paths of computation accepting $x$. We denote this measure of complexity by $A_{Ne}$, and study a class of languages of low $A_{Ne}$-complexity defined as $L_q = \{ \, x \in Σ^* : A_{Ne}(x) < q|x| \, \}$, which is parameterised by rationals $q \in (0,1/2)$ (generalising a class of sets first studied by Kjos-Hanssen). We show that for every $q \in (0,1/2)$, this class is neither context-free nor recognisable by certain Boolean circuits. In the process, we answer an open question of Kjos-Hanssen quantifying the complexity of $L_{1/3}$ in terms of Boolean circuits, and also prove the Shannon effect for $A_{Ne}$.
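To make the definition of exact acceptance concrete, here is a minimal Python sketch. It is an illustration only and not code from the paper. It assumes an NFA given as a transition dictionary with a single start state and a set of accepting states. The names `accepts`, `exactly_accepts` and `witnesses_membership` are hypothetical. A witnessing automaton with $k$ states only shows $A_{Ne}(x) \le k$, so the last function can confirm $x \in L_q$ but can never refute it.

```python
from fractions import Fraction
from itertools import product


def accepts(delta, start, accepting, word):
    """Subset simulation: True if some run on `word` ends in an accepting state."""
    current = {start}
    for symbol in word:
        current = {t for s in current for t in delta.get((s, symbol), ())}
    return bool(current & accepting)


def exactly_accepts(delta, start, accepting, word, alphabet):
    """True if the NFA accepts `word` and rejects every other word of the same length.

    Brute force over all |alphabet|^len(word) words, so only usable for short words.
    """
    if not accepts(delta, start, accepting, word):
        return False
    for letters in product(alphabet, repeat=len(word)):
        other = "".join(letters)
        if other != word and accepts(delta, start, accepting, other):
            return False
    return True


def witnesses_membership(num_states, word, q):
    """True if a `num_states`-state exact acceptor certifies A_Ne(word) < q * |word|."""
    return num_states < q * len(word)


# Hypothetical example: a 2-state cycle reading 0 then 1, accepting exactly (01)^*.
delta = {(0, "0"): {1}, (1, "1"): {0}}
word = "010101"
is_exact = exactly_accepts(delta, 0, {0}, word, "01")
in_L_two_fifths = is_exact and witnesses_membership(2, word, Fraction(2, 5))
in_L_one_third = is_exact and witnesses_membership(2, word, Fraction(1, 3))
```

In this example the two-state cycle exactly accepts `010101`, which certifies membership in $L_{2/5}$ because $2 < 12/5$. The same witness is inconclusive for $L_{1/3}$ because $2 < 2$ fails. Deciding membership in general would require a search over all automata with fewer than $q|x|$ states, which this sketch does not attempt.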
