The Trembling-Hand Problem for LTLf Planning

Pian Yu, Shufang Zhu, Giuseppe De Giacomo, Marta Kwiatkowska, Moshe Vardi

TL;DR

This work addresses trembling-hand errors in planning for temporally extended goals specified in LTL$_f$, covering both deterministic and nondeterministic environments. It develops a two-pronged solution. For deterministic domains, action-instruction errors are encoded into an MDP, and the LTL$_f$ objective is solved via DFA-product methods. For nondeterministic domains, the problem is modeled as an MDP with set-valued transitions (MDPST) under robust strategies, and solved by taking the product with the DFA and running a specialized value iteration on a reduced sub-MDPST. A formal framework for MDPSTs with LTL$_f$ objectives is introduced, including a robust satisfaction criterion and efficient Bellman equations. The approach is validated on a human-robot co-assembly scenario, demonstrating correctness and promising scalability through state-space pruning and a proof-of-concept implementation.

Abstract

Consider an agent acting to achieve its temporal goal, but with a "trembling hand". In this case, the agent may mistakenly instruct, with a certain (typically small) probability, actions that are not intended due to faults or imprecision in its action selection mechanism, thereby leading to possible goal failure. We study the trembling-hand problem in the context of reasoning about actions and planning for temporally extended goals expressed in Linear Temporal Logic on finite traces (LTLf), where we want to synthesize a strategy (aka plan) that maximizes the probability of satisfying the LTLf goal in spite of the trembling hand. We consider both deterministic and nondeterministic (adversarial) domains. We propose solution techniques for both cases by relying respectively on Markov Decision Processes and on Markov Decision Processes with Set-valued Transitions with LTLf objectives, where the set-valued probabilistic transitions capture both the nondeterminism from the environment and the possible action instruction errors from the agent. We formally show the correctness of our solution techniques and demonstrate their effectiveness experimentally through a proof-of-concept implementation.
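To make the deterministic case concrete, the following is a minimal sketch of the general idea only, not the authors' implementation. It assumes a uniform error model (the intended action is executed with probability 1 - eps, otherwise one of the other actions is chosen uniformly), a hypothetical three-state toy domain, and a hand-written DFA for the goal "reach goal before hazard". All names (`perturbed_mdp`, `max_sat_probability`, `dfa_step`, the states and actions) are illustrative assumptions. The sketch builds the perturbed MDP, pairs it with the DFA, and runs plain value iteration for the maximal probability of reaching an accepting DFA state.

```python
def perturbed_mdp(actions, delta, eps):
    """Turn a deterministic domain into an MDP under an ASSUMED uniform
    trembling-hand model: the intended action runs w.p. 1 - eps, and each
    other action runs w.p. eps / (number of other actions)."""
    P = {}
    for s in delta:
        P[s] = {}
        for a in actions:
            dist = {}
            for b in actions:
                p = (1 - eps) if b == a else eps / (len(actions) - 1)
                t = delta[s][b]              # deterministic successor
                dist[t] = dist.get(t, 0.0) + p
            P[s][a] = dist
    return P


def max_sat_probability(P, label, dfa_step, dfa_states, accepting, tol=1e-12):
    """Value iteration on the product of the MDP and the DFA.
    V[(s, q)] is the maximal probability of reaching an accepting DFA state.
    This is sound here because the toy DFA's accepting state is absorbing."""
    V = {(s, q): 1.0 if q in accepting else 0.0 for s in P for q in dfa_states}
    policy = {}
    while True:
        diff = 0.0
        for (s, q) in V:
            if q in accepting:
                continue
            best_a, best_v = None, -1.0
            for a, dist in P[s].items():
                # The DFA reads the label of the successor state.
                v = sum(p * V[(t, dfa_step(q, label[t]))] for t, p in dist.items())
                if v > best_v:
                    best_a, best_v = a, v
            diff = max(diff, abs(best_v - V[(s, q)]))
            V[(s, q)] = best_v
            policy[(s, q)] = best_a
        if diff < tol:
            return V, policy


# Hypothetical toy domain: "goal" and "hazard" are absorbing.
actions = ["go", "slip", "wait"]
delta = {
    "start":  {"go": "goal", "slip": "hazard", "wait": "start"},
    "goal":   {a: "goal" for a in actions},
    "hazard": {a: "hazard" for a in actions},
}
label = {"start": set(), "goal": {"goal"}, "hazard": {"hazard"}}


def dfa_step(q, props):
    """DFA for 'no hazard until goal': 0 = pending, 1 = accepted, 2 = rejected."""
    if q != 0:
        return q
    if "goal" in props:
        return 1
    if "hazard" in props:
        return 2
    return 0


P = perturbed_mdp(actions, delta, eps=0.1)
V, policy = max_sat_probability(P, label, dfa_step, [0, 1, 2], accepting={1})
print(policy[("start", 0)], V[("start", 0)])
```

In this toy instance, intending `go` succeeds directly with probability 0.9 and returns to `start` with probability 0.05, so the optimal value solves v = 0.9 + 0.05 v, giving v = 18/19. The nondeterministic case described in the abstract additionally requires set-valued transitions and a robust (worst-case) criterion, which this sketch does not cover.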

Paper Structure (12 sections, 4 theorems, 7 equations, 6 figures, 1 table, 1 algorithm)

Key Result

Theorem 1

Let $\mathcal{P}^d = (\mathcal{D}, \varphi, \mathcal{E})$ be a TH problem as defined in Def. 2, and let $\mathcal{M}$ be the MDP constructed from $\mathcal{P}^d$. An optimal strategy for $\mathcal{M}$ with $\varphi$ is an optimal strategy for $\mathcal{P}^d$, and vice versa, where the optimal strategy $\sigma_p^*$ for $\mathcal{P}^d$ is as given in Def. 2.

Figures (6)

  • Figure 1: An arch. Left: 2, 3, and 4 blocks. Right: 5 and 6 blocks.
  • Figure 2: A perturbed transition in a det. domain.
  • Figure 3: A perturbed transition in a nondet. domain.
  • Figure 4: An execution example of an optimal strategy for the arch-building task. Robot-intended actions, robot-executed actions, and human interventions are shown in black, brick, and blue, respectively.
  • Figure 5: Number of states and transitions in the constructed MDPST for $2\le |OBJ|\le 6$ (in log scale).
  • ...and 1 more figure

Theorems & Definitions (17)

  • Definition 1: Perturbed path in $\mathcal{D}$
  • Definition 2: TH problem for LTL$_f$ planning in $\mathcal{D}$
  • Example 1
  • Theorem 1
  • proof
  • Theorem 2
  • Definition 3: Perturbed paths in $\mathcal{N}$
  • Definition 4: TH problem for LTL$_f$ planning in $\mathcal{N}$
  • Example 2
  • Definition 5: Feasible distribution in MDPSTs
  • ...and 7 more