STL-SVPIO: Signal Temporal Logic guided Stein Variational Path Integral Optimization

Hongrui Zheng, Zirui Zang, Ahmad Amine, Cristian Ioan Vasile, Rahul Mangharam

Abstract

Signal Temporal Logic (STL) enables formal specification of complex spatiotemporal constraints for robotic task planning. However, synthesizing long-horizon continuous control trajectories from complex STL specifications is fundamentally challenging due to the nested structure of STL robustness objectives. Existing solver-based methods, such as Mixed-Integer Linear Programming (MILP), suffer from exponential scaling, whereas sampling methods, such as Model-Predictive Path Integral control (MPPI), struggle with sparse, long-horizon costs. We introduce Signal Temporal Logic guided Stein Variational Path Integral Optimization (STL-SVPIO), which reframes STL as a globally informative, differentiable reward-shaping mechanism. By leveraging Stein Variational Gradient Descent and differentiable physics engines, STL-SVPIO transports a mutually repulsive swarm of control particles toward high-robustness regions. Our method transforms sparse logical satisfaction into tractable variational inference, mitigating the severe local-minima traps of standard gradient-based methods. We demonstrate that STL-SVPIO significantly outperforms existing methods in both robustness and efficiency on traditional STL tasks. Moreover, it solves complex long-horizon tasks, including multi-agent coordination with synchronization and queuing, on which baselines either fail to discover feasible solutions or become computationally intractable. Finally, we apply STL-SVPIO to agile robotic motion planning tasks with nonlinear dynamics, such as 7-DoF manipulation and half-cheetah backflips, to show the generalizability of our algorithm.
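
The idea of moving a mutually repulsive swarm of particles toward high-robustness regions can be illustrated with a small toy sketch. It is not the authors' implementation: it has no dynamics or physics engine, each particle is a plain 2D point rather than a control sequence, and the goal centers, radius, temperature, inverse temperature `beta`, kernel bandwidth, and step size are all hypothetical values chosen for illustration. The specification "reach goal A OR goal B" is scored with a log-sum-exp smoothed maximum, and the target density is assumed to be proportional to `exp(beta * robustness)`.

```python
import numpy as np

# Hypothetical toy setup: two circular goal regions in the plane.
GOALS = np.array([[2.0, 0.0], [-2.0, 0.0]])
RADIUS = 0.5


def predicate_robustness(x):
    """Per-goal robustness r^2 - ||x - g||^2, shape (n, num_goals)."""
    diff = x[:, None, :] - GOALS[None, :, :]
    return RADIUS**2 - np.sum(diff**2, axis=-1)


def hard_robustness(x):
    """Exact robustness of the disjunction: max over the goal predicates."""
    return predicate_robustness(x).max(axis=1)


def smooth_robustness_grad(x, temperature=1.0):
    """Gradient of the log-sum-exp smoothed max with respect to x."""
    z = temperature * predicate_robustness(x)
    w = np.exp(z - z.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)               # softmax weights per goal
    diff = x[:, None, :] - GOALS[None, :, :]
    return np.sum(w[:, :, None] * (-2.0 * diff), axis=1)


def svgd_step(x, beta=5.0, bandwidth=1.0, step=0.05):
    """One Stein Variational Gradient Descent update with an RBF kernel."""
    n = x.shape[0]
    score = beta * smooth_robustness_grad(x)        # grad of assumed log-density
    pair = x[:, None, :] - x[None, :, :]            # pair[i, j] = x_i - x_j
    k = np.exp(-np.sum(pair**2, axis=-1) / bandwidth)
    attract = k @ score                             # kernel-weighted scores
    repel = (2.0 / bandwidth) * np.sum(k[:, :, None] * pair, axis=1)
    return x + step * (attract + repel) / n


rng = np.random.default_rng(0)
particles = rng.normal(size=(50, 2))
before = hard_robustness(particles).mean()
for _ in range(200):
    particles = svgd_step(particles)
after = hard_robustness(particles).mean()
print(f"mean robustness: {before:.3f} -> {after:.3f}")
```

The `attract` term pulls each particle along the kernel-averaged robustness gradient, while the `repel` term (the kernel gradient) pushes nearby particles apart, which is what keeps the swarm from collapsing onto a single solution.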

Paper Structure (21 sections, 27 equations, 9 figures, 1 table, 1 algorithm)

Figures (9)

  • Figure 1: Overview of the STL-SVPIO Optimization Algorithm.
  • Figure 2: Selected trajectories of each method through iterations.
  • Figure 3: Aggregated average robustness and satisfaction rate of methods. All algorithms besides MILP were tested across 100 different random sampling seeds. Each solution with positive robustness counts as a satisfaction. Multiagent Button corresponds to the scenario in Section \ref{sec:exp_button}, Multiagent Corridor corresponds to the scenario in Section \ref{sec:exp_corridor}, Multiagent Sync Goals corresponds to the scenario in Section \ref{sec:exp_sync}, and Long Horizon corresponds to the scenario in Section \ref{sec:exp_long_horizon}. Gurobi was given a 10-hour time limit to solve the MILP.
  • Figure 4: Long horizon trajectory synthesized by STL-SVPIO through a cluttered obstacle field. Dark green and red markers indicate the start and final states, respectively.
  • Figure 5: STL-SVPIO synthesizes a multi-agent trajectory that satisfies strict ordering constraints between task events.
  • ...and 4 more figures