The Myhill-Nerode Theorem for Bounded Interaction: Canonical Abstractions via Agent-Bounded Indistinguishability

Anthony T. Nixon

The Myhill-Nerode Theorem for Bounded Interaction: Canonical Abstractions via Agent-Bounded Indistinguishability

Anthony T. Nixon

Abstract

Any capacity-limited observer induces a canonical quotient on its environment: two situations that no bounded agent can distinguish are, for that agent, the same. We formalise this for finite POMDPs. A fixed probe family of finite-state controllers induces a closed-loop Wasserstein pseudometric on observation histories and a probe-exact quotient merging histories that no controller in the family can distinguish. The quotient is canonical, minimal, and unique-a bounded-interaction analogue of the Myhill-Nerode theorem. For clock-aware probes, it is exactly decision-sufficient for objectives that depend only on the agent's observations and actions; for latent-state rewards, we use an observation-Lipschitz approximation bound. The main theorem object is the clock-aware quotient; scalable deterministic-stationary experiments study a tractable coarsening with gap measured on small exact cases and explored empirically at larger scale. We validate theorem-level claims on Tiger and GridWorld. We also report operational case studies on Tiger, GridWorld, and RockSample as exploratory diagnostics of approximation behavior and runtime, not as theorem-facing evidence when no exact cross-family certificate is available; heavier stress tests are archived in the appendix and artifact package.

The Myhill-Nerode Theorem for Bounded Interaction: Canonical Abstractions via Agent-Bounded Indistinguishability

Abstract

Paper Structure (106 sections, 23 theorems, 54 equations, 5 figures, 31 tables, 2 algorithms)

This paper contains 106 sections, 23 theorems, 54 equations, 5 figures, 31 tables, 2 algorithms.

Introduction
Proof roadmap.
Related work.
Preliminaries
Bounded Indistinguishability and the Wasserstein Pseudometric
Equivalence on Histories and the Quotient POMDP
Quotient Construction
Structural Properties
epsilon-Quotients and Data-Processing Monotonicity
Operational Approximation Machinery
Structural vs. operational role.
Practical tractability.
Greedy controller-subset selection.
Operational certificate for controller subsets.
Experiments
...and 91 more sections

Key Result

Proposition 3.2

$D_T^\Pi$ is a pseudometric on POMDPs sharing $(A,O)$, with $D_T^\Pi(M,N) = 0$ iff $P_M^\pi = P_N^\pi$ for all $\pi \in \Pi$.

Figures (5)

Figure 1: (a) Any system with hidden state, an observation channel, and a bounded observer $(m, T, \delta_O)$ admits a canonical quotient---a minimal equivalence over situations the observer cannot distinguish. (b) The POMDP instantiation: observation histories are probed by all bounded FSCs; pairwise $\mathcal{W}_1$ distances (maximised over policies) are clustered; the quotient POMDP preserves all observation laws with certified value loss.
Figure 2: Refinement lattice for Tiger ($T{=}4$). Solid arrows: increasing $\varepsilon$ coarsens the quotient (Definition \ref{['def:eps-equiv']}). Dashed arrows: increasing $m$ refines the quotient (Proposition \ref{['prop:probe-class-coarsening']}---larger probe class $\Rightarrow$ finer partition). Data from operational capacity sweep.
Figure 3: Tiger POMDP quotient structures for $m=1$, $T=2$. Left: operational exact-for-family quotient (7 classes). Right: $\varepsilon=0.5$ quotient merges $L$ and $R$ (3 classes).
Figure 4: Operational runtime scaling with horizon (log scale): monolithic, layered, and empirical calibrated-subset tracks.
Figure 5: Layered bound validation (Theorem \ref{['thm:layered-composition']}). Each bar compares the empirical LHS distortion (blue) against the theoretical RHS bound (orange) for eight test cases spanning single-layer reduction, wrapper contraction (deterministic and stochastic), and long-horizon theorem bounds ($T \in \{4, 8, 10\}$). All checks satisfy LHS $\leq$ RHS.

Theorems & Definitions (62)

Definition 2.1: Finite POMDP
Definition 2.2: Stochastic Finite-State Controller
Definition 3.1: Closed-loop Wasserstein pseudometric
Proposition 3.2
proof
Remark 3.3: Relation to bisimulation metrics
Definition 3.4: $L_R$-observation-Lipschitz reward
Remark 3.5: When is $L_R$ small?
Theorem 3.6: Value-function error bound
proof
...and 52 more

The Myhill-Nerode Theorem for Bounded Interaction: Canonical Abstractions via Agent-Bounded Indistinguishability

Abstract

The Myhill-Nerode Theorem for Bounded Interaction: Canonical Abstractions via Agent-Bounded Indistinguishability

Authors

Abstract

Table of Contents

Key Result

Figures (5)

Theorems & Definitions (62)