Analyzing constrained LLM through PDFA-learning

Matías Carrasco, Franz Mayr, Sergio Yovine, Johny Kidd, Martín Iturbide, Juan Pedro da Silva, Alejo Garat

TL;DR

An algorithm is developed for efficiently learning the quotient with respect to a congruence that copes with the null next-symbol probabilities that arise when the output of a language model is constrained by some means during text generation.

Abstract

We define a congruence that copes with the null next-symbol probabilities that arise when the output of a language model is constrained by some means during text generation. We develop an algorithm for efficiently learning the quotient with respect to this congruence and evaluate it on case studies that analyze statistical properties of LLMs.
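
To make the notion of a null next-symbol probability concrete, the following is a minimal sketch of one common way to constrain generation: keep only the k most probable next symbols and renormalize. It is an illustration under stated assumptions, not necessarily the constraint mechanism used in the paper; the function name `top_k_constrain`, the three-symbol alphabet, and the probabilities are all hypothetical.

```python
def top_k_constrain(probs, k):
    """Keep the k most probable symbols, set the rest to 0.0, and renormalize.

    Hypothetical helper for illustration only; `probs` maps each symbol
    to its unconstrained next-symbol probability.
    """
    # Sort by decreasing probability; break ties by symbol for determinism.
    kept = sorted(probs, key=lambda s: (-probs[s], s))[:k]
    total = sum(probs[s] for s in kept)
    # Symbols outside the top k receive a null (zero) probability.
    return {s: (probs[s] / total if s in kept else 0.0) for s in probs}


# Made-up next-symbol distribution over {a, b} plus a terminal symbol "$".
next_symbol = {"a": 0.5, "b": 0.3, "$": 0.2}
constrained = top_k_constrain(next_symbol, 2)
print(constrained)
```

With k = 2, the symbol `"$"` falls outside the kept set and ends up with probability exactly 0.0, while the remaining mass is redistributed over `"a"` and `"b"`. Zeros of this kind are what a congruence over next-symbol distributions has to handle.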

Paper Structure (19 sections, 7 theorems, 19 equations, 9 figures, 2 tables, 1 algorithm)

This paper contains 19 sections, 7 theorems, 19 equations, 9 figures, 2 tables, 1 algorithm.

Key Result

Proposition 2.1

For all $u, v \in \Sigma^\ast$, $u \equiv v$ if and only if …

Proof. See Appendix proof:prop_indist_new.

Figures (9)

  • Figure 1: PDFA $\mathcal{A}$ (left) and $\mathcal{B}$ (right) over $\Sigma = \{a, b\}$ with $q_{\mathrm{in}} = q_0$.
  • Figure 2: Difference between $\equiv^\bullet_{E}$ and $\equiv_{E}$
  • Figure 3: Running time curves: (left) As function of $\theta$ (right) As function of $n$
  • Figure 4: Synchronization: (left) $\mathcal{L}$ (center) $\mathcal{G}$ (right) $\mathcal{B} = \mathsf{samptop}_{2}(\mathcal{L}\times\mathcal{G})$
  • Figure 5: Distributions of floats and the lengths of their representing strings (digit sampling).
  • ...and 4 more figures

Theorems & Definitions (13)

  • Proposition 2.1
  • Proposition 2.2
  • Proposition 2.3
  • Corollary 2.1
  • Proposition 2.4
  • Proposition 3.1
  • proof
  • proof
  • proof
  • proof
  • ...and 3 more