Learning Explainable and Better Performing Representations of POMDP Strategies

Alexander Bork; Debraj Chakraborty; Kush Grover; Jan Kretinsky; Stefanie Mohr

TL;DR

This work presents a method that learns an automaton representation of a POMDP strategy using a modification of the L*-algorithm. The resulting automaton is dramatically smaller, and thus also more explainable, than the tabular representation of the strategy.

Abstract

Strategies for partially observable Markov decision processes (POMDPs) typically require memory. One way to represent this memory is via automata. We present a method to learn an automaton representation of a strategy using a modification of the L*-algorithm. Compared to the tabular representation of a strategy, the resulting automaton is dramatically smaller and thus also more explainable. Moreover, in the learning process, our heuristics may even improve the strategy's performance. In contrast to approaches that synthesize an automaton directly from the POMDP, thereby solving it, our approach is incomparably more scalable.
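
To make the size contrast between a tabular strategy and an automaton concrete, the following is a minimal Python sketch. It is not the paper's learning algorithm: the two observations, the two actions, the parity rule, and all names (`table_strategy`, `FSC`, `PARITY_FSC`) are hypothetical and chosen only for illustration. The sketch assumes a Mealy-style controller that reads one observation per step, emits an action, and updates its memory node.

```python
from itertools import product

# Hypothetical toy setting (not taken from the paper): two observations, two actions.
OBSERVATIONS = ("o1", "o2")


def table_strategy(max_len):
    """Tabular strategy with one entry per observation sequence up to max_len.

    Toy rule: play 'b' iff the sequence contains an odd number of 'o2'.
    """
    table = {}
    for n in range(1, max_len + 1):
        for seq in product(OBSERVATIONS, repeat=n):
            table[seq] = "b" if seq.count("o2") % 2 == 1 else "a"
    return table


class FSC:
    """Mealy-style finite-state controller (illustrative assumption).

    step maps (memory node, observation) to (action, next memory node).
    """

    def __init__(self, initial, step):
        self.initial = initial
        self.step = step

    def action(self, seq):
        """Return the action chosen after reading the observation sequence."""
        node, act = self.initial, None
        for obs in seq:
            act, node = self.step[(node, obs)]
        return act


# Two memory nodes suffice to track the parity of 'o2' observations.
PARITY_FSC = FSC(
    initial="even",
    step={
        ("even", "o1"): ("a", "even"),
        ("even", "o2"): ("b", "odd"),
        ("odd", "o1"): ("b", "odd"),
        ("odd", "o2"): ("a", "even"),
    },
)

if __name__ == "__main__":
    table = table_strategy(10)
    agree = all(PARITY_FSC.action(seq) == act for seq, act in table.items())
    print(len(table), len(PARITY_FSC.step), agree)
```

In this toy case the table grows exponentially with the length of the observation sequences it covers, while the controller stays at two memory nodes and four transitions and is also defined on longer sequences than the table contains.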

Paper Structure (24 sections, 11 figures, 7 tables, 1 algorithm)

Introduction
Partially Observable Markov Decision Processes (POMDPs)
Strategy Representation
Current Approaches
Our Contribution
Related Work
Preliminaries
Learning a Finite-State Controller
Automaton Learning
Learning Table
From Learning Table to FSC
Algorithm
Proof of Concept: Belief Exploration
Improving Learned FSCs for Incomplete Information
Experimental Evaluation
...and 9 more sections

Figures (11)

Figure 1: Running example: POMDP
Figure 1: Example strategy table for the running-example POMDP. It only contains observation sequences of length at most 2.
Figure 2: Depiction of the FSC learning framework
Figure 3: FSC representing the example strategy table.
Figure 4: Running example - initial table
...and 6 more figures

Theorems & Definitions (9)

Definition 1: MDP
Definition 2: POMDP
Definition 3: Strategy
Definition 4: Finite-State Controller
Definition 5: Strategy Table
Definition 6: Output Query (OQ)
Definition 7: Equivalence Query (EQ)
Definition 8: Learning Table
Definition 9: Learned FSC
