Active Learning of Symbolic Automata Over Rational Numbers

Sebastian Hagedorn; Martín Muñoz; Cristian Riveros; Rodrigo Toro Icarte

TL;DR

This work extends Angluin's $L^*$ algorithm to learn symbolic automata over the rational numbers, within a minimally adequate teacher (MAT) framework that handles infinite alphabets. The core idea is to learn finite piecewise functions over $\mathbb{Q}$ using a Stern–Brocot-based interval learning scheme, and then to integrate this scheme into an $L^*$-style framework that learns symbolic finite automata (SFAs) with inequality predicates. The authors prove that the resulting learning process uses a number of membership and equivalence queries that is linear in the size of the target representation, achieving optimal query efficiency and removing restrictions on counterexample forms. The approach applies to practical settings such as RGX and time-series analysis, and it leverages break links and convergents in the Stern–Brocot tree to identify interval endpoints efficiently. Overall, the paper delivers a theoretically grounded, query-efficient method for learning SFAs over dense rational alphabets, with broad applicability in AI and software engineering.
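To make the role of the Stern–Brocot tree concrete, here is a minimal sketch of how an unknown rational endpoint can be located by descending the tree. It is not the paper's algorithm: the three-way oracle `compare` and the function name `stern_brocot_search` are assumptions made for illustration, and this naive walk takes one query per tree level, whereas the summary above credits break links and convergents for the linear query bound.

```python
from fractions import Fraction


def stern_brocot_search(compare, max_steps=10_000):
    """Locate a positive rational by walking down the Stern-Brocot tree.

    `compare(q)` is a hypothetical oracle (an assumption of this sketch):
    it returns a negative number if q is below the target, zero if q equals
    the target, and a positive number if q is above it.
    """
    lo_n, lo_d = 0, 1  # left bound 0/1
    hi_n, hi_d = 1, 0  # right bound 1/0, standing in for +infinity
    for _ in range(max_steps):
        # The mediant of the two bounds is the next node of the tree.
        mediant = Fraction(lo_n + hi_n, lo_d + hi_d)
        outcome = compare(mediant)
        if outcome == 0:
            return mediant
        if outcome < 0:
            # Mediant is too small: it becomes the new left bound.
            lo_n, lo_d = mediant.numerator, mediant.denominator
        else:
            # Mediant is too large: it becomes the new right bound.
            hi_n, hi_d = mediant.numerator, mediant.denominator
    raise RuntimeError("target not found within max_steps")


# Usage: the hidden endpoint 3/7 is reached via 1/1, 1/2, 1/3, 2/5, 3/7.
hidden = Fraction(3, 7)
found = stern_brocot_search(lambda q: (q > hidden) - (q < hidden))
```

Every positive rational appears exactly once in the tree, so the walk terminates for any positive rational target; the number of steps equals the depth of the target in the tree.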

Abstract

Automata learning has many applications in artificial intelligence and software engineering. Central to these applications is the $L^*$ algorithm, introduced by Angluin. The $L^*$ algorithm learns deterministic finite-state automata (DFAs) in polynomial time when provided with a minimally adequate teacher. Unfortunately, the $L^*$ algorithm can only learn DFAs over finite alphabets, which limits its applicability. In this paper, we extend $L^*$ to learn symbolic automata whose transitions use predicates over rational numbers, i.e., over infinite and dense alphabets. Our result makes the $L^*$ algorithm applicable to new settings like (real) RGX, and time series. Furthermore, our proposed algorithm is optimal in the sense that it asks a number of queries to the teacher that is at most linear with respect to the number of transitions, and to the representation size of the predicates.
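As an illustration of what a symbolic automaton with inequality predicates over the rationals can look like, the sketch below represents each state's outgoing transitions as a piecewise-constant function from $\mathbb{Q}$ to states. The class name `SymbolicAutomaton`, the cut-point encoding, and the example language are assumptions of this sketch; the abstract does not fix a concrete representation.

```python
from bisect import bisect_left
from fractions import Fraction


class SymbolicAutomaton:
    """Deterministic automaton whose transitions are interval predicates.

    transitions[state] = (cuts, targets), where cuts is a sorted list of
    k rationals c_1 < ... < c_k and targets has k + 1 states. A symbol x
    moves to targets[i] for the intervals
    (-inf, c_1], (c_1, c_2], ..., (c_k, +inf) in that order.
    This encoding is an assumption of the sketch.
    """

    def __init__(self, initial, accepting, transitions):
        self.initial = initial
        self.accepting = set(accepting)
        self.transitions = transitions

    def step(self, state, symbol):
        cuts, targets = self.transitions[state]
        # bisect_left counts the cut points strictly below the symbol,
        # which is the index of the interval that contains it.
        return targets[bisect_left(cuts, symbol)]

    def accepts(self, word):
        state = self.initial
        for symbol in word:
            state = self.step(state, symbol)
        return state in self.accepting


# Example: accept exactly the sequences whose values are all <= 1/2.
sfa = SymbolicAutomaton(
    initial=0,
    accepting={0},
    transitions={
        0: ([Fraction(1, 2)], [0, 1]),  # x <= 1/2 stays in 0, otherwise go to 1
        1: ([], [1]),                   # state 1 is a rejecting sink
    },
)
```

With this encoding, the size of the automaton is governed by the number of cut points per state, which matches the abstract's measure of the number of transitions together with the representation size of the predicates.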
