Analyzing Robustness of Angluin's L$^*$ Algorithm in Presence of Noise

Lina Ye, Igor Khmelnitsky, Serge Haddad, Benoît Barbot, Benedikt Bollig, Martin Leucker, Daniel Neider, Rajarshi Roy

TL;DR

This paper studies how the PAC variant of Angluin's L$^*$ algorithm (KV's algorithm) behaves when the device being learned is a DFA perturbed by noise. It introduces four noise models (random output, random input, counter DFA, and pathological behaviours) and evaluates robustness through extensive experiments on hundreds of randomly generated DFAs, paired with a theoretical analysis of randomness versus structure. The empirical results show that learning is robust under random noise, performs poorly under structured noise, and can eliminate pathological behaviours specified in a regular way. The theory shows that random perturbations almost surely make the language of the noisy device non-recursively enumerable. The authors also propose a practical size-reduction strategy to counter overfitting and discuss implications for learning from noisy black-box devices in real-world settings.

Abstract

Angluin's L$^*$ algorithm learns the minimal deterministic finite automaton (DFA) of a regular language using membership and equivalence queries. Its probably approximately correct (PAC) version replaces each equivalence query with a large number of random membership queries in order to obtain a high level of confidence in the answer. It can therefore be applied to any kind of device and may be viewed as an algorithm for synthesizing an automaton that abstracts the behavior of the device based on observations. Here we are interested in how Angluin's PAC learning algorithm behaves for devices obtained from a DFA by introducing some noise. More precisely, we study whether Angluin's algorithm reduces the noise and produces a DFA closer to the original one than the noisy device is. We propose several ways to introduce the noise: (1) the noisy device inverts the classification of words w.r.t. the DFA with a small probability, (2) the noisy device modifies, with a small probability, the letters of the word before asking for its classification w.r.t. the DFA, (3) the noisy device combines the classification of a word w.r.t. the DFA with its classification w.r.t. a counter automaton, and (4) the noisy device is obtained by a random process from two DFAs such that the language of the first is included in that of the second: a word accepted by the first DFA is accepted, a word rejected by the second DFA is rejected, and in the remaining cases the word is accepted with probability 0.5. Our main experimental contributions consist in showing that Angluin's algorithm (1) behaves well whenever the noisy device is produced by a random process, (2) behaves poorly with structured noise, and (3) is able to eliminate pathological behaviours specified in a regular way. Theoretically, we show that randomness almost surely yields systems with non-recursively enumerable languages.
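The first two noise models and the sampling-based replacement of an equivalence query can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `DFA`, `noisy_output`, `noisy_input`, `random_word`, `sample_size` and `pac_equivalence` are hypothetical, the word distribution (geometric length, uniform letters) is an assumption, and a perturbed letter is assumed to be redrawn uniformly from the alphabet. Noise decisions are cached per word so that the noisy device denotes one fixed language. The sample size follows the classical bound from Angluin's PAC analysis; the indexing of rounds is an assumption.

```python
import math
import random


class DFA:
    """A complete DFA; transitions maps (state, letter) to a state."""

    def __init__(self, alphabet, transitions, initial, accepting):
        self.alphabet = alphabet
        self.transitions = transitions
        self.initial = initial
        self.accepting = accepting

    def accepts(self, word):
        state = self.initial
        for letter in word:
            state = self.transitions[(state, letter)]
        return state in self.accepting


def noisy_output(dfa, p, rng):
    """Noise model (1): invert the classification of a word with probability p."""
    flips = {}  # cached so that repeated queries on a word agree

    def classify(word):
        if word not in flips:
            flips[word] = rng.random() < p
        return dfa.accepts(word) != flips[word]

    return classify


def noisy_input(dfa, p, rng):
    """Noise model (2): perturb each letter with probability p, then classify."""
    cache = {}

    def classify(word):
        if word not in cache:
            # Assumption: a perturbed letter is redrawn uniformly from the alphabet.
            perturbed = "".join(
                rng.choice(dfa.alphabet) if rng.random() < p else letter
                for letter in word
            )
            cache[word] = dfa.accepts(perturbed)
        return cache[word]

    return classify


def random_word(alphabet, rng, stop=0.1):
    """Assumed word distribution: geometric length, uniform letters."""
    letters = []
    while rng.random() >= stop:
        letters.append(rng.choice(alphabet))
    return "".join(letters)


def sample_size(epsilon, delta, round_index):
    """Number of random membership queries replacing one equivalence query."""
    return math.ceil((math.log(1 / delta) + round_index * math.log(2)) / epsilon)


def pac_equivalence(hypothesis, device, alphabet, n_samples, rng):
    """Return a sampled word on which hypothesis and device disagree, else None."""
    for _ in range(n_samples):
        word = random_word(alphabet, rng)
        if hypothesis(word) != device(word):
            return word
    return None
```

For instance, wrapping a DFA for $(a+b)^*a$ with `noisy_output(dfa, 0.01, random.Random(0))` gives a device that a learner can query through `pac_equivalence`, with `n_samples` taken from `sample_size(epsilon, delta, r)` at round `r`.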

Paper Structure

This paper contains 39 sections, 5 theorems, 11 equations, 5 figures, 9 tables, 2 algorithms.

Key Result

Lemma 4.1

Let $R$ be a random language over $\Sigma$. Let $(w_n)_{n\in \mathbb{N}}$ be a sequence of words of $\Sigma^*$. Let $W_n=\{w_i\}_{i<n}$ and $\rho_n=\max_{W\subseteq W_n}\mathbf{Pr}(R \cap W_n=W)$. Assume that $\lim_{n\rightarrow \infty} \rho_n=0$. Then, for all countable families of languages ...
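
As a hedged illustration of the hypothesis $\lim_{n\rightarrow \infty} \rho_n=0$ only, consider the following setting, which is an assumption made for this example and not part of the lemma: $R$ is obtained from a fixed language $L$ by flipping the membership of each word independently with probability $p\in(0,1)$, and the words $w_i$ are pairwise distinct. For $W\subseteq W_n$, let $k$ be the number of words of $W_n$ on which $W$ and $L$ disagree. Then

$$\mathbf{Pr}(R \cap W_n=W)=p^{k}(1-p)^{n-k}, \qquad \rho_n=\max_{0\le k\le n} p^{k}(1-p)^{n-k}=\max(p,1-p)^{n}\xrightarrow[n\rightarrow\infty]{}0,$$

since $\max(p,1-p)<1$. For example, $p=0.01$ gives $\rho_n=0.99^{n}$.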

Figures (5)

  • Figure 1: The experimental setup and the studied distances
  • Figure 2: Number of rounds analysis
  • Figure 3: A DFA $\mathcal{A}$ where $a^3\Sigma^*\cap \mathcal{L}(\mathcal{A})=\emptyset$.
  • Figure 4: Two DFAs
  • Figure 5: A DFA $\mathcal{A}$ with $\mathcal{L}(\mathcal{A})=(a+b)^*a$

Theorems & Definitions (13)

  • Lemma 4.1
  • proof
  • Theorem 4.2
  • proof
  • Definition 4.3: Markov chain
  • Definition 4.4: Irreducibility and Periodicity
  • Definition 4.5: equal-length-distinguishing DFA
  • Theorem 4.6
  • proof
  • Proposition 4.7
  • ...and 3 more