Table of Contents
Fetching ...

Learning Quantitative Automata Modulo Theories

Eric Hsiung, Swarat Chaudhuri, Joydeep Biswas

TL;DR

QUINTIC is presented, an active learning algorithm, wherein the learner infers a valid automaton through deductive reasoning, by applying a theory to a set of currently available constraints and an assumed preference model and quantitative automaton class.

Abstract

Quantitative automata are useful representations for numerous applications, including modeling probability distributions over sequences to Markov chains and reward machines. Actively learning such automata typically occurs using explicitly gathered input-output examples under adaptations of the L-star algorithm. However, obtaining explicit input-output pairs can be expensive, and there exist scenarios, including preference-based learning or learning from rankings, where providing constraints is a less exerting and a more natural way to concisely describe desired properties. Consequently, we propose the problem of learning deterministic quantitative automata from sets of constraints over the valuations of input sequences. We present QUINTIC, an active learning algorithm, wherein the learner infers a valid automaton through deductive reasoning, by applying a theory to a set of currently available constraints and an assumed preference model and quantitative automaton class. QUINTIC performs a complete search over the space of automata, and is guaranteed to be minimal and correctly terminate. Our evaluations utilize theory of rationals in order to learn summation, discounted summation, product, and classification quantitative automata, and indicate QUINTIC is effective at learning these types of automata.

Learning Quantitative Automata Modulo Theories

TL;DR

QUINTIC is presented, an active learning algorithm, wherein the learner infers a valid automaton through deductive reasoning, by applying a theory to a set of currently available constraints and an assumed preference model and quantitative automaton class.

Abstract

Quantitative automata are useful representations for numerous applications, including modeling probability distributions over sequences to Markov chains and reward machines. Actively learning such automata typically occurs using explicitly gathered input-output examples under adaptations of the L-star algorithm. However, obtaining explicit input-output pairs can be expensive, and there exist scenarios, including preference-based learning or learning from rankings, where providing constraints is a less exerting and a more natural way to concisely describe desired properties. Consequently, we propose the problem of learning deterministic quantitative automata from sets of constraints over the valuations of input sequences. We present QUINTIC, an active learning algorithm, wherein the learner infers a valid automaton through deductive reasoning, by applying a theory to a set of currently available constraints and an assumed preference model and quantitative automaton class. QUINTIC performs a complete search over the space of automata, and is guaranteed to be minimal and correctly terminate. Our evaluations utilize theory of rationals in order to learn summation, discounted summation, product, and classification quantitative automata, and indicate QUINTIC is effective at learning these types of automata.

Paper Structure

This paper contains 26 sections, 2 theorems, 16 equations, 13 figures, 1 table, 1 algorithm.

Key Result

theorem 1

Quintic performs a complete search of automata space.

Figures (13)

  • Figure 1: Upper left: closure operation on symbolic observation table. Upper right: consistency operation on symbolic observation table. Bottom Left: high-level Remap algorithm, which is an algorithm based on $\textrm{L}^*$. Bottom Right: Details for RemapSymbolicFill and MakeHypothesis procedures.
  • Figure 2: Example constraints obtained under Quintic for preference queries (left) and equivalence queries (right) under non-discounted summation $\mathbf{Val}$ and $\mathcal{V}$ functions.
  • Figure 3: Variable equivalence inference rules encoded as a decision tree. Preference query pairs indicate the relation $\mathbf{R}$ in $T(s\cdot\sigma) \mathbf{R} T(s'\cdot\sigma') \iff \mathbf{Val}(s\cdot \sigma)+\mathbf{Val}(s') \mathbf{R} \mathbf{Val}(s'\cdot \sigma') + \mathbf{Val}(s)$ for non-discounted summation valuation. For $\mathbf{R}$ to remain unknown, both preference query pairs must reflect that $\mathbf{R}$ is unknown. Otherwise, the relation $\mathbf{R}$ can be determined.
  • Figure 4: Results for 7 targets (columns) out of 25. Full results in Appendix. Each row is corresponds to a metric. Legend: S (strong feedback), W (weak feedback), VE (single MaxSMT objective Equation \ref{['eq:max-smt']}), CC-VE (joint MaxSMT objectives closedness and consistency objective, and Equation \ref{['eq:max-smt']}), Remap is the baseline. Each bar is the empirical mean of 100 trials, and error bars represent 1 standard deviation.
  • Figure 5: Counterexample Expansion Ablation, measuring number of MaxSMT objective solves, under non-discounted summation valuation, with IDS enabled. This measures the number of $(\mathcal{E,O})$ pairs considered.
  • ...and 8 more figures

Theorems & Definitions (7)

  • definition 1: Unified
  • definition 2: Closed
  • definition 3: Consistent
  • theorem 1: Completeness
  • proof
  • theorem 2: Minimalism
  • proof