SwarmDiff: Swarm Robotic Trajectory Planning in Cluttered Environments via Diffusion Transformer

Kang Ding, Chunxuan Jiao, Yunze Hu, Kangjie Zhou, Pengying Wu, Yao Mu, Chang Liu

TL;DR

This work tackles scalable trajectory planning for large swarms in obstacle-dense environments, where traditional methods struggle with computational cost and safety. It proposes SwarmDiff, a hierarchical framework that models the swarm as a time-varying PDF $\chi(x,t)$, represented as a Gaussian Mixture Model (GMM) $\chi(x,t)=\sum_{j=1}^{N_k} \omega_j^k g_j^k$ with weights $\omega_j^k$ and Gaussian components $g_j^k$. A Diffusion Transformer (DiT) generates the macroscopic path, and microscopic control is then derived through density mapping and distributed model predictive control (MPC). The method couples diffusion-based sampling with optimal transport via a cost-gradient guidance framework that combines CVaR for collision risk, Wasserstein distance for transport, and a Gaussian Process cost for smoothness, plus a linear program (LP) that fuses DiT outputs into a globally consistent GMM trajectory. Extensive simulations and real-world experiments with 10 robots demonstrate improved computational efficiency, trajectory validity, and scalability compared with baselines, highlighting SwarmDiff’s practical applicability to large-scale swarm coordination.

Abstract

Swarm robotic trajectory planning faces challenges in computational efficiency, scalability, and safety, particularly in complex, obstacle-dense environments. To address these issues, we propose SwarmDiff, a hierarchical and scalable generative framework for swarm robots. We model the swarm's macroscopic state using Probability Density Functions (PDFs) and leverage conditional diffusion models to generate risk-aware macroscopic trajectory distributions, which then guide the generation of individual robot trajectories at the microscopic level. To ensure a balance between the swarm's optimal transportation and risk awareness, we integrate Wasserstein metrics and Conditional Value at Risk (CVaR). Additionally, we introduce a Diffusion Transformer (DiT) to improve sampling efficiency and generation quality by capturing long-range dependencies. Extensive simulations and real-world experiments demonstrate that SwarmDiff outperforms existing methods in computational efficiency, trajectory validity, and scalability, making it a reliable solution for swarm robotic trajectory planning.
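The two quantities the abstract pairs, Wasserstein metrics for transport and CVaR for risk, can be illustrated with a minimal sketch. It is a generic textbook illustration and not the paper's implementation: the function names `gaussian_w2_squared` and `empirical_cvar` are hypothetical, the Wasserstein term uses the standard closed form for two Gaussians, and CVaR is estimated empirically from sampled losses rather than from the paper's collision-risk model.

```python
import numpy as np


def sqrtm_psd(a):
    """Square root of a symmetric positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh(a)
    vals = np.clip(vals, 0.0, None)  # guard against tiny negative eigenvalues
    return (vecs * np.sqrt(vals)) @ vecs.T


def gaussian_w2_squared(m1, s1, m2, s2):
    """Squared 2-Wasserstein distance between N(m1, s1) and N(m2, s2).

    Closed form: |m1 - m2|^2 + tr(s1 + s2 - 2 (s2^(1/2) s1 s2^(1/2))^(1/2)).
    """
    m1, m2 = np.asarray(m1, dtype=float), np.asarray(m2, dtype=float)
    s1, s2 = np.asarray(s1, dtype=float), np.asarray(s2, dtype=float)
    root_s2 = sqrtm_psd(s2)
    cross = sqrtm_psd(root_s2 @ s1 @ root_s2)
    return float(np.sum((m1 - m2) ** 2) + np.trace(s1 + s2 - 2.0 * cross))


def empirical_cvar(losses, alpha):
    """Empirical CVaR: mean of the losses at or above the alpha-quantile (VaR)."""
    losses = np.asarray(list(losses), dtype=float)
    var = np.quantile(losses, alpha)  # Value at Risk at level alpha
    return float(losses[losses >= var].mean())


if __name__ == "__main__":
    # Transport cost between two 2-D Gaussian components (illustrative values).
    cost = gaussian_w2_squared([0.0, 0.0], np.eye(2), [3.0, 4.0], 4.0 * np.eye(2))
    print("squared W2:", cost)

    # Tail risk of a hypothetical set of sampled collision losses.
    rng = np.random.default_rng(0)
    print("CVaR at 0.9:", empirical_cvar(rng.exponential(size=1000), 0.9))
```

In this sketch the Wasserstein term grows with both the displacement of the mean and the change in covariance, while CVaR averages only the worst outcomes, so it penalizes rare but severe losses more strongly than a plain expectation would.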

Paper Structure

This paper contains 19 sections, 11 equations, 4 figures, 3 tables, 1 algorithm.

Figures (4)

  • Figure 1: Motion planning experiment with ten robots in a real-world obstacle-dense environment. The figure illustrates the robots' trajectories over five selected time steps, where each blue or red cluster represents the robot formations at a specific moment. The robots dynamically split and merge to navigate through the obstacles efficiently.
  • Figure 2: SwarmDiff overview. Environment Input: the Scene Encoder extracts features from the obstacle map and the precomputed Euclidean Signed Distance Field (ESDF) as conditioning inputs, and the Context Encoder processes the start and target distributions to provide contextual information. Macroscopic Planning: in the bottom left (center), a Diffusion Transformer (DiT) iteratively refines an initial noisy Gaussian trajectory $\xi_T$ through a denoising process guided by task-oriented cost gradients. This process yields an optimized Gaussian trajectory $\xi_0$, which is further refined via optimal transport to generate a GMM-based trajectory. The details of SwarmDiff's DiT and of the denoising process are illustrated in the middle. Microscopic Control: individual reference trajectories are derived from the GMM trajectory by the Density Control method, and distributed MPC ensures coordinated swarm motion.
  • Figure 3: Trajectory comparison in environments I and II. Trajectories of (a) SwarmDiff, (b) SwarmPRM, and (c) FC with $N = 500$ robots from the same initial positions, and (d) dRRT* with 50 robots sampled from the same distribution due to computational limits. Initial and final positions are colored circles, obstacles are black, and collisions (in dRRT*) are red.
  • Figure 4: Experimental results of 10 robots in real-world environments. The top row shows real-world scenarios: Scene Scatter (left) and Scene Maze (right). The bottom row presents visualizations of real experimental data.