Logically Constrained Robotics Transformers for Enhanced Perception-Action Planning

Parv Kapoor, Sai Vemprala, Ashish Kapoor

TL;DR

This paper addresses how to align foundation-model-based trajectory planning with precise safety constraints expressed in Signal Temporal Logic (STL). It introduces PASTEL, a specification-conditioned, cross-attention-augmented, decoder-only transformer that builds on a pretrained PACT model and uses repeated STL tokens and specification embeddings to steer trajectory predictions toward sequences that satisfy the specification φ. An STLPy-based dataset supports pretraining and evaluation in a 2D planar setting. Across multiple STL patterns, the approach achieves substantially higher specification satisfaction than the baselines while respecting actuation constraints. The work shows a practical path toward safer, more reliable robotics by combining formal STL specifications with large-scale, data-driven planning, and it identifies future improvements such as decomposing long-horizon specifications and enhancing robustness.

Abstract

With the advent of planning based on large foundation models, there is a pressing need to ensure that model output aligns with the stakeholder's intent. When these models are deployed in the real world, the need for alignment is magnified by the potential cost to life and infrastructure from unexpected failures. Temporal logic specifications have long provided a way to constrain system behaviors and are a natural fit for these use cases. In this work, we propose a novel approach to factor in Signal Temporal Logic specifications while using autoregressive transformer models for trajectory planning. We also provide a trajectory dataset for pretraining and evaluating foundation models. Our proposed technique achieves 74.3% higher specification satisfaction than the baselines.
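
To make "specification satisfaction" concrete, the following is a minimal sketch of how a 2D trajectory could be scored against a simple reach-avoid STL pattern (eventually enter a goal box, always stay out of an obstacle box) using the standard quantitative robustness semantics. It is an illustration only, not the paper's evaluation code: the box coordinates, the sample trajectories, and the function names `inside_box`, `reach_avoid_robustness`, and `satisfaction_rate` are all assumptions introduced here, and only NumPy is used rather than STLPy.

```python
import numpy as np

def inside_box(traj, box):
    """Signed margin of each 2D point with respect to an axis-aligned box.

    Positive when the point is inside the box, negative when outside.
    box = (xmin, xmax, ymin, ymax); traj has shape (T, 2).
    """
    xmin, xmax, ymin, ymax = box
    x, y = traj[:, 0], traj[:, 1]
    return np.minimum.reduce([x - xmin, xmax - x, y - ymin, ymax - y])

def reach_avoid_robustness(traj, goal, obstacle):
    """Robustness of (eventually in goal) AND (always outside obstacle).

    'Eventually' is a max over time, 'always' is a min over time,
    and conjunction is a min. A positive value means the sampled
    trajectory satisfies the specification.
    """
    reach = inside_box(traj, goal).max()
    avoid = (-inside_box(traj, obstacle)).min()
    return float(min(reach, avoid))

def satisfaction_rate(trajs, goal, obstacle):
    """Fraction of trajectories with strictly positive robustness."""
    scores = np.array([reach_avoid_robustness(t, goal, obstacle) for t in trajs])
    return float(np.mean(scores > 0))

# Hypothetical workspace: goal in the upper-right corner, obstacle in the middle.
goal = (8.0, 10.0, 8.0, 10.0)
obstacle = (4.0, 6.0, 4.0, 6.0)

# One trajectory that goes around the obstacle, one that cuts through it.
safe = np.array([[0.0, 0.0], [0.0, 9.0], [9.0, 9.0]])
unsafe = np.array([[0.0, 0.0], [5.0, 5.0], [9.0, 9.0]])

print(reach_avoid_robustness(safe, goal, obstacle))    # 1.0
print(reach_avoid_robustness(unsafe, goal, obstacle))  # -1.0
print(satisfaction_rate([safe, unsafe], goal, obstacle))  # 0.5
```

Under this kind of measure, a satisfaction rate is the share of generated trajectories whose robustness is positive. Note that the sketch checks only the sampled waypoints, so a coarsely sampled trajectory could cross an obstacle between samples without being detected.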

Paper Structure

This paper contains 16 sections, 10 equations, 1 figure, 2 tables.

Figures (1)

  • Figure 1: The PASTEL Architecture utilizes both causal and cross attention mechanisms to autoregressively predict Signal Temporal Logic (STL) specification satisfying trajectories conditioned on state, action, and specification embeddings.