Congruence-based Learning of Probabilistic Deterministic Finite Automata

Matías Carrasco, Franz Mayr, Sergio Yovine

TL;DR

The paper addresses learning probabilistic deterministic finite automata (PDFA) from language models by developing an algebraic framework built on similarities and equivalences over probability distributions. It introduces a congruence-based generalization of the Myhill-Nerode theory, defines a quotient PDFA, and proposes the learning algorithm $\mathrm{L_{\mathcal{E}}^\ast}$, which recovers the quotient from a language model whenever that model is regular, with proofs of correctness and termination. A central result is that, for congruences, recognizability of language models coincides with regularity; in contrast, for tolerance relations, which are not congruences, this coincidence fails, which limits what learning approaches based on them can guarantee. Together, these findings formalize a principled pathway for PDFA learning and clarify the conditions under which learning algorithms can faithfully recover minimal automaton representations of language-model behavior.

Abstract

This work studies the question of learning probabilistic deterministic automata from language models. For this purpose, it focuses on analyzing the relations defined on algebraic structures over strings by equivalences and similarities on probability distributions. We introduce a congruence that extends the classical Myhill-Nerode congruence for formal languages. This new congruence is the basis for defining regularity over language models. We present an active learning algorithm that computes the quotient with respect to this congruence whenever the language model is regular. The paper also defines the notion of recognizability for language models and shows that it coincides with regularity for congruences. For relations which are not congruences, it shows that this is not the case. Finally, it discusses the impact of this result on learning in the context of language models.

Paper Structure

This paper contains 21 sections, 35 theorems, 50 equations, 5 figures, 1 algorithm.

Key Result

Proposition 1

For every $\delta, \delta' \in \Delta(\Sigma_\$)$, if $\delta=_\kappa\delta'$ then $\delta\approx_{(\mathit{vd},\kappa^{-1})}\delta'$.
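Proposition 1 can be checked numerically on small examples. The sketch below is illustrative only and rests on definitions that this summary does not spell out: it assumes $=_\kappa$ means that, for every symbol, both probabilities fall in the same cell of a partition of $[0,1]$ into $\kappa$ equal-width intervals, and that $\approx_{(\mathit{vd},t)}$ means the largest per-symbol difference is at most $t$. The function names are hypothetical and not taken from the paper.

```python
from math import floor

def quantize(p, kappa):
    """Index of the cell of [0, 1] containing p, assuming kappa equal-width cells."""
    # p == 1.0 is placed in the last cell rather than in a cell of its own.
    return min(floor(p * kappa), kappa - 1)

def kappa_equivalent(d1, d2, kappa):
    """Assumed reading of =_kappa: same cell for every symbol."""
    return all(quantize(d1[s], kappa) == quantize(d2[s], kappa) for s in d1)

def variation_distance(d1, d2):
    """Assumed reading of vd: largest per-symbol probability difference."""
    return max(abs(d1[s] - d2[s]) for s in d1)

def vd_similar(d1, d2, threshold):
    """Assumed reading of the tolerance relation: vd at most the threshold."""
    return variation_distance(d1, d2) <= threshold

kappa = 5  # cells of width 1/5

# Two distributions over {a, b, $} that share a cell on every symbol.
d1 = {"a": 0.52, "b": 0.31, "$": 0.17}
d2 = {"a": 0.58, "b": 0.27, "$": 0.15}
assert kappa_equivalent(d1, d2, kappa)
assert vd_similar(d1, d2, 1 / kappa)  # consistent with Proposition 1

# Close distributions that straddle the cell boundary at 0.6 on symbol "a".
d3 = {"a": 0.59, "b": 0.26, "$": 0.15}
d4 = {"a": 0.61, "b": 0.24, "$": 0.15}
assert not kappa_equivalent(d3, d4, kappa)
assert vd_similar(d3, d4, 1 / kappa)
```

Under these assumed definitions, two probabilities in the same cell differ by at most the cell width $\kappa^{-1}$, which is why the implication holds. The second pair shows that the converse can fail: distributions may be similar without being $\kappa$-equivalent.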

Figures (5)

  • Figure 1: Definition of $\overline{\pi}$.
  • Figure 2: Definition of $\beta$.
  • Figure 3: (Left) $A$. (Right) $B$.
  • Figure 4: PDFA $A$
  • Figure : $\mathrm{L_{\mathcal{E}}^\ast}$ learning algorithm

Theorems & Definitions (80)

  • Proposition 1
  • proof
  • Proposition 2
  • proof
  • Proposition 3
  • proof
  • Proposition 4
  • proof
  • Example 1
  • Proposition 5
  • ...and 70 more