Table of Contents
Fetching ...

Multi-Agent Temporal Logic Planning via Penalty Functions and Block-Coordinate Optimization

Eleftherios E. Vlahakis, Arash Bahari Kordabad, Lars Lindemann, Pantelis Sopasakis, Sadegh Soudjani, Dimos V. Dimarogonas

TL;DR

This work forms STL planning as an optimization program under arbitrary multi-agent constraints and introduces a penalty-based unconstrained relaxation that can be efficiently solved via a Block-Coordinate Gradient Descent (BCGD) method, where each block corresponds to a single agent's decision variables, thereby mitigating complexity.

Abstract

Multi-agent planning under Signal Temporal Logic (STL) is often hindered by collaborative tasks that lead to computational challenges due to the inherent high-dimensionality of the problem, preventing scalable synthesis with satisfaction guarantees. To address this, we formulate STL planning as an optimization program under arbitrary multi-agent constraints and introduce a penalty-based unconstrained relaxation that can be efficiently solved via a Block-Coordinate Gradient Descent (BCGD) method, where each block corresponds to a single agent's decision variables, thereby mitigating complexity. By utilizing a quadratic penalty function defined via smooth STL semantics, we show that BCGD iterations converge to a stationary point of the penalized problem under standard regularity assumptions. To enforce feasibility, the BCGD solver is embedded within a two-layer optimization scheme: inner BCGD updates are performed for a fixed penalty parameter, which is then increased in an outer loop to progressively improve multi-agent STL robustness. The proposed framework enables scalable computations and is validated through various complex multi-robot planning scenarios.

Multi-Agent Temporal Logic Planning via Penalty Functions and Block-Coordinate Optimization

TL;DR

This work forms STL planning as an optimization program under arbitrary multi-agent constraints and introduces a penalty-based unconstrained relaxation that can be efficiently solved via a Block-Coordinate Gradient Descent (BCGD) method, where each block corresponds to a single agent's decision variables, thereby mitigating complexity.

Abstract

Multi-agent planning under Signal Temporal Logic (STL) is often hindered by collaborative tasks that lead to computational challenges due to the inherent high-dimensionality of the problem, preventing scalable synthesis with satisfaction guarantees. To address this, we formulate STL planning as an optimization program under arbitrary multi-agent constraints and introduce a penalty-based unconstrained relaxation that can be efficiently solved via a Block-Coordinate Gradient Descent (BCGD) method, where each block corresponds to a single agent's decision variables, thereby mitigating complexity. By utilizing a quadratic penalty function defined via smooth STL semantics, we show that BCGD iterations converge to a stationary point of the penalized problem under standard regularity assumptions. To enforce feasibility, the BCGD solver is embedded within a two-layer optimization scheme: inner BCGD updates are performed for a fixed penalty parameter, which is then increased in an outer loop to progressively improve multi-agent STL robustness. The proposed framework enables scalable computations and is validated through various complex multi-robot planning scenarios.
Paper Structure (14 sections, 3 theorems, 16 equations, 1 figure, 1 table, 2 algorithms)

This paper contains 14 sections, 3 theorems, 16 equations, 1 figure, 1 table, 2 algorithms.

Key Result

Proposition 1

Consider the formula $\phi$ in eq:global_phi, where $\mathcal{K}_\phi$ contains at least two cliques, i.e., $|\mathcal{K}_\phi| \geq 2$. Then, for all multi-agent sequences $\bm{u}$ and $\Gamma > 0$, the smooth robustness satisfies $\varrho_\Gamma^\phi(\bm{u}) {<} \rho^\phi(\bm{u})$. Consequently, $

Figures (1)

  • Figure 1: RURAMCA motion planning for ten robots with discrete-time unicycle dynamics: $z_i(t+1)=z_i(t)+v_i(t)\cos\theta_i(t)$, $y_i(t+1)=y_i(t)+v_i(t)\sin\theta_i(t)$, and $\theta_i(t+1)=\theta_i(t)+\omega_i(t)$, where $(z_i(t),y_i(t))\in{\rm I\!R}^2$ denote the Cartesian position of the $i^{\text{th}}$ robot, $\theta_i(t)\in{\rm I\!R}$ its heading, and $v_i(t)\in{\rm I\!R}$, $\omega_i(t)\in{\rm I\!R}$ the linear and angular control inputs, respectively, with a sampling period of $1$ s. The trajectories are generated from initial conditions (colored squares) using input sequences obtained via Alg. \ref{['alg:penalty_method']} (using Alg. \ref{['alg:BCGD']} as inner solver). Simulation parameters: $\Gamma=2$ for inner smooth operators, $\Gamma=1$ for the outer softmin in \ref{['eq:joint_STL_constraints']}, $\sigma=0.5$ and $\gamma=0.995$ for the Armijo rule, $\bm{u}^0=0$, $\lambda^0=1$, $\eta_\lambda=5$, $\epsilon_{\rm infeas}=5.0\, 10^{-4}$, $\epsilon=10^{-3}$, and $H^k=10^3 I$. The running cost functions are selected as $\ell_i=v_i(t)^2+\omega_i(t)^2$, $\forall t\in\mathcal{I}$, and the terminal cost $V_{f,i}=0$.

Theorems & Definitions (4)

  • Definition 1
  • Proposition 1
  • Proposition 2
  • Theorem 1