Flow-Opt: Scalable Centralized Multi-Robot Trajectory Optimization with Flow Matching and Differentiable Optimization

Simon Idoko, Arun Kumar Singh

TL;DR

Flow-Opt tackles scalable centralized multi-robot trajectory optimization by combining a context-aware flow-matching model with a differentiable, learnable Safety Filter. The approach uses a Diffusion Transformer backbone plus permutation-invariant encoders to generate diverse candidate trajectories, then refines them with a GPU-accelerated, warm-started fixed-point solver. It plans for tens of robots in tens of milliseconds and can process many problem instances in parallel as a batch. Key contributions include the first application of flow matching to multi-robot planning, a differentiable and learnable inference-time refinement (with a learned initialization) to ensure constraint satisfaction, and extensive empirical validation showing substantial speedups and smoother trajectories compared to diffusion-based and batch-sequential baselines. The work has practical impact for real-time, centralized planning in cluttered environments (e.g., warehouses) and offers a foundation for data-driven simulation and training of navigation policies.

Abstract

Centralized trajectory optimization in the joint space of multiple robots allows access to a larger feasible space that can result in smoother trajectories, especially when planning in tight spaces. Unfortunately, it is often computationally intractable beyond a very small swarm size. In this paper, we propose Flow-Opt, a learning-based approach towards improving the computational tractability of centralized multi-robot trajectory optimization. Specifically, we reduce the problem to first learning a generative model to sample different candidate trajectories and then using a learned Safety Filter (SF) to ensure fast inference-time constraint satisfaction. We propose a flow-matching model with a diffusion transformer (DiT) augmented with permutation-invariant robot position and map encoders as the generative model. We develop a custom solver for our SF and equip it with a neural network that predicts context-specific initialization. The initialization network is trained in a self-supervised manner, taking advantage of the differentiability of the SF solver. We advance the state of the art in the following respects. First, we show that we can generate trajectories of tens of robots in cluttered environments in a few tens of milliseconds. This is several times faster than existing centralized optimization approaches. Moreover, our approach also generates smoother trajectories orders of magnitude faster than competing baselines based on diffusion models. Second, each component of our approach can be batched, allowing us to solve a few tens of problem instances in a fraction of a second. We believe this is the first such result; no existing approach provides such capabilities. Finally, our approach can generate a diverse set of trajectories between a given set of start and goal locations, which can capture different collision-avoidance behaviors.
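
To make the sampling step concrete, the sketch below shows how a flow-matching model is typically sampled: a velocity field is integrated from noise at t = 0 to a sample at t = 1. It is an illustration under stated assumptions, not the authors' implementation. The names `toy_velocity` and `sample_flow` are hypothetical, the closed-form velocity field stands in for the learned DiT, and a fixed-step Euler integrator is assumed.

```python
import random


def toy_velocity(x, t, target):
    """Hypothetical stand-in for the learned velocity network.

    For the straight-line path x_t = (1 - t) * x_0 + t * x_1 with a single
    known endpoint x_1 = target, the velocity field is (target - x) / (1 - t).
    A trained model would instead predict this from x, t and the context.
    """
    return [(g - xi) / (1.0 - t) for xi, g in zip(x, target)]


def sample_flow(velocity, x0, target, num_steps=10):
    """Integrate dx/dt = velocity(x, t) from t = 0 to t = 1 with Euler steps."""
    x = list(x0)
    dt = 1.0 / num_steps
    for k in range(num_steps):
        t = k * dt  # t stays strictly below 1, so (1 - t) never reaches zero
        v = velocity(x, t, target)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x


rng = random.Random(0)
target = [0.0, 0.5, 1.0, 1.5]  # toy "trajectory": four waypoint coordinates
noise = [rng.gauss(0.0, 1.0) for _ in target]  # starting point at t = 0
sample = sample_flow(toy_velocity, noise, target)
```

With this toy field every noise draw is carried to the same target. A learned, context-conditioned field would map different noise draws to different trajectories, which is where the diversity described in the abstract comes from.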

Paper Structure

This paper contains 35 sections, 27 equations, 15 figures, 7 tables.

Figures (15)

  • Figure 1: The overview of our multi-robot trajectory pipeline. It has two core components: a trained flow policy and a Safety Filter (SF). The trained flow policy takes in the start and goal positions of the robots and the static obstacle placements (additionally, velocities for dynamic obstacles) and outputs a distribution of trajectories, $\boldsymbol{\xi}$. The multiple flow-sampled trajectories are refined in parallel through the SF, and the trajectory with the lowest constraint residual and smoothness cost is output as the optimal solution. Our SF is accelerated by an initialization network that is conditioned on the samples drawn from the flow policy.
  • Figure 2: Architecture of our Flow Matching Network. Start-goal and obstacle encoders are based on PointNet++ to ensure invariance to the shuffling of obstacle coordinates and start and goal pairs.
  • Figure 3: Process of denoising random noise to feasible multi-robot trajectories through a flow policy.
  • Figure 4: The training pipeline for learning a warm start for the SF solver. The architecture has two components: a learnable part consisting of start-goal and obstacle encoders and a transformer-based network, and a fixed part that resembles $L$ fixed-point iterations of our SF solver. The loss function consists simply of a fixed-point residual. During training, the gradients flow through the fixed-point layer, ensuring the initialization $({^0}\overline{\boldsymbol{\xi}}, {^0}\overline{\boldsymbol{\lambda}} )$ produced by the networks is used by the downstream SF solver. This in turn requires that the fixed-point iterations be differentiable.
  • Figure 5: The three stages of our trajectory planning pipeline. We first sample a large number of trajectories ($\approx 256$) from the trained flow policy (Fig. (a), (d)). We then sort the sampled trajectories by constraint satisfaction and choose the 10 with the lowest residual (Fig. (b), (e)). Finally, these trajectories are refined through the SF, and the feasible trajectory with the lowest smoothness cost is output as the optimal solution (Fig. (c), (f)).
  • ...and 10 more figures
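
The three stages described in the Figure 5 caption (sample, rank by constraint residual, refine and select) can be sketched as follows. This is a minimal illustration under several assumptions, not the paper's code: all function names are hypothetical, the residual counts only inter-robot clearance violations (obstacles are omitted), smoothness is measured by squared second differences, and `identity_refine` is a placeholder where the actual Safety Filter solver would run.

```python
def smoothness_cost(traj):
    """Sum of squared second differences over every robot's waypoints."""
    cost = 0.0
    for robot in traj:
        for a, b, c in zip(robot, robot[1:], robot[2:]):
            cost += sum((pa - 2.0 * pb + pc) ** 2 for pa, pb, pc in zip(a, b, c))
    return cost


def collision_residual(traj, min_dist=1.0):
    """Total clearance violation, summed over robot pairs and time steps."""
    total = 0.0
    for i in range(len(traj)):
        for j in range(i + 1, len(traj)):
            for p, q in zip(traj[i], traj[j]):
                dist = sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
                total += max(0.0, min_dist - dist)
    return total


def identity_refine(traj):
    """Placeholder (assumption): the real Safety Filter would project here."""
    return traj


def select_and_refine(samples, refine, top_k=10, tol=1e-6):
    """Rank samples by residual, refine the best top_k, return the smoothest
    feasible result, or None if nothing is feasible after refinement."""
    ranked = sorted(samples, key=collision_residual)[:top_k]
    refined = [refine(t) for t in ranked]
    feasible = [t for t in refined if collision_residual(t) <= tol]
    if not feasible:
        return None
    return min(feasible, key=smoothness_cost)


# Three toy joint trajectories for two robots, three 2-D waypoints each.
straight = [[(0, 0), (1, 0), (2, 0)], [(0, 2), (1, 2), (2, 2)]]
wiggly = [[(0, 0), (1, 1), (2, 0)], [(0, 3), (1, 3), (2, 3)]]
colliding = [[(0, 0), (1, 0), (2, 0)], [(0, 0.5), (1, 0.5), (2, 0.5)]]

best = select_and_refine([wiggly, colliding, straight], identity_refine, top_k=2)
```

In this toy run the colliding sample is dropped at the ranking stage, and the straight-line sample wins over the wiggly one on smoothness. Ranking before refinement keeps the more expensive refinement step limited to `top_k` candidates.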

Theorems & Definitions (1)

  • Remark 1