Lifted Forward Planning in Relational Factored Markov Decision Processes with Concurrent Actions

Florian Andreas Marwitz, Tanya Braun, Ralf Möller, Marcel Gehrke

TL;DR

The paper presents a first-order representation that stores the state and action spaces in polynomial rather than exponential size in the number of objects. It also introduces Foreplan, a relational forward planner that uses this representation to efficiently compute policies for numerous indistinguishable objects and actions.

Abstract

Decision making is a central problem in AI that can be formalized using a Markov Decision Process. A problem is that, with increasing numbers of (indistinguishable) objects, the state space grows exponentially. To compute policies, the state space has to be enumerated. Even more possibilities have to be enumerated if the size of the action space depends on the size of the state space, especially if we allow concurrent actions. To tackle the exponential blow-up in the action and state space, we present a first-order representation to store the spaces in polynomial instead of exponential size in the number of objects and introduce Foreplan, a relational forward planner, which uses this representation to efficiently compute policies for numerous indistinguishable objects and actions. Additionally, we introduce an even faster approximate version of Foreplan. Moreover, Foreplan identifies how many objects an agent should act on to achieve a certain task given restrictions. Further, we provide a theoretical analysis and an empirical evaluation of Foreplan, demonstrating a speedup of at least four orders of magnitude.
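
To make the size argument concrete, the following is a minimal sketch, not Foreplan's actual representation. It assumes each object carries a fixed number of Boolean properties (the function names and the `props_per_object` parameter are hypothetical) and compares the number of ground states with the number of count-based states that remain once objects are treated as indistinguishable.

```python
from math import comb


def ground_state_count(num_objects: int, props_per_object: int = 1) -> int:
    """States when every object is tracked individually (assumed Boolean properties)."""
    return 2 ** (num_objects * props_per_object)


def counting_state_count(num_objects: int, props_per_object: int = 1) -> int:
    """States when only the number of objects per property assignment matters.

    With t = 2**props_per_object possible assignments, a state is a histogram
    over t bins summing to num_objects (stars and bars), which is polynomial
    in num_objects for a fixed t.
    """
    num_assignments = 2 ** props_per_object
    return comb(num_objects + num_assignments - 1, num_assignments - 1)


if __name__ == "__main__":
    # Illustrative sizes only; 22 matches the largest instance named in Figure 2.
    for n in (5, 10, 22):
        print(n, ground_state_count(n), counting_state_count(n))
```

With one Boolean property per object, the ground count doubles with every added object, while the count-based view only needs to know how many objects have the property.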

Paper Structure

This paper contains 38 sections, 13 theorems, 47 equations, 3 figures, 2 tables.

Key Result

Theorem 11

The representation in the state representation definition (label: def:state-representation) is correct.

Figures (3)

  • Figure 1: Lifted representation of the transition model for Example 3 (Epidemic). We abbreviate by using only the first letter(s) for each symbol.
  • Figure 2: Runtime of (Approximate) Foreplan, ALP and XADD Symbolic Value Iteration on the epidemic example for up to 22 persons with a time limit of two hours.
  • Figure 3: Runtime (log scale) of (Approximate) Foreplan, ALP and XADD Symbolic Value Iteration on the epidemic example with a time limit of two hours and a memory limit of 16 GB. Only runs within these limits are shown.

Theorems & Definitions (52)

  • Definition 1: Markov Decision Process
  • Example 1
  • Definition 2: Bellman Equation
  • Definition 3: Factored MDP
  • Definition 4: Parfactor model
  • Example 2
  • Example 3: Epidemic
  • Definition 5: Relational Factored MDPs
  • Definition 6: Action PRV
  • Example 4: Action PRV
  • ...and 42 more