SwarmGPT: Combining Large Language Models with Safe Motion Planning for Drone Swarm Choreography

Martin Schuck, Dinushka Orrin Dahanaggamaarachchi, Ben Sprenger, Vedant Vyas, Siqi Zhou, Angela P. Schoellig

TL;DR

SwarmGPT tackles the challenge of choreographing large drone swarms to music while ensuring safety and feasibility. It introduces a framework that couples an LLM-based choreographer with a distributed optimization-based safety filter to convert natural-language intents into safe, time-synchronized trajectories. The approach scales to swarms of up to 200 drones in simulation and up to 20 drones in hardware, and supports iterative refinement via reprompting. This work demonstrates that foundation-model-guided planning can be safely integrated into safety-critical swarm robotics, enabling non-experts to design complex performances and informing future disaster-response and search-and-rescue applications.

Abstract

Drone swarm performances -- synchronized, expressive aerial displays set to music -- have emerged as a captivating application of modern robotics. Yet designing smooth, safe choreographies remains a complex task requiring expert knowledge. We present SwarmGPT, a language-based choreographer that leverages the reasoning power of large language models (LLMs) to streamline drone performance design. The LLM is augmented by a safety filter that ensures deployability by making minimal corrections when safety or feasibility constraints are violated. By decoupling high-level choreographic design from low-level motion planning, our system enables non-experts to iteratively refine choreographies using natural language without worrying about collisions or actuator limits. We validate our approach through simulations with swarms of up to 200 drones and real-world experiments with up to 20 drones performing choreographies to diverse types of songs, demonstrating scalable, synchronized, and safe performances. Beyond entertainment, this work offers a blueprint for integrating foundation models into safety-critical swarm robotics applications.

Paper Structure

This paper contains 19 sections, 7 equations, 8 figures.

Figures (8)

  • Figure 1: SwarmGPT generates drone swarm choreographies from natural language, synchronized with music. Demonstration videos are available at https://tiny.cc/swarmgpt and in the supplementary materials (S1–S7). The project website is https://utiasdsl.github.io/swarm_GPT/.
  • Figure 2: Overview of the SwarmGPT system. (a) A high-level LLM designs unique choreographies, while a low-level optimization-based safety filter ensures feasible, collision-free deployment. (b) The drone performers are quadrotor vehicles. (c) A music processor extracts beat times and audio features from the song. (d) The LLM generates choreographies from the audio features and music beat times, producing raw drone trajectories. (e) The safety filter outputs safe, feasible trajectories for simulation or hardware deployment, enabling non-experts to iteratively refine via natural language.
  • Figure 3: Visualizations of two example motion primitives in the form of Eq. \ref{eqn:primitive_form}. The largest solid markers show drones’ current positions with trailing tails indicating motion over time; a single blue trajectory highlights one drone’s path. (a) Wave primitive mimicking a surface wave on a grid-configured swarm, with frequency and amplitude parameters adjustable by the LLM [Du2018FastAI]. (b) Spiral primitive rotating a circular swarm about the $z$-axis while gradually expanding its radius, configurable by the LLM.
  • Figure 4: Matching music to choreography characteristics. (a) Average speed of all drones in the swarm over 10 performances per song. The dark blue line depicts each song's novelty function, and the dashed light-blue lines denote the beat times. The speed profile is distinct for each song and correlates with the beat times. (b) Average value of the spiral speed parameter over 25 performances for a slow and a fast song. Boxes denote the 10th and 90th percentiles, with the median in orange. The LLM uses faster speeds for the upbeat song, showing that the music analysis influences the design.
  • Figure 5: Examples of LLM swarm behaviour modification. In the user study, 10 participants modified an initial performance (blue) with the goal of increasing speed, increasing height, or reducing circular motion primitives. The choreography statistics (orange) show that participants consistently achieved their goal. White lines mark the median, dashed lines mark the respective mean values, and arrows indicate the desired change.
  • ...and 3 more figures
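
To make the spiral primitive in Figure 3(b) concrete, the following is a minimal sketch of a swarm on a circle that rotates about the z-axis while its radius grows. It is not the paper's primitive equation: the function names, the parameters (r_start, r_end, turns, height), and the linear radius growth are all assumptions chosen for illustration. The helper that reports the smallest inter-drone distance only hints at the kind of quantity a safety filter would need to monitor.

```python
import math


def spiral_primitive(n_drones, t, duration,
                     r_start=1.0, r_end=2.0, turns=1.0, height=1.0):
    """Positions (x, y, z) of n_drones at time t for a hypothetical spiral.

    Assumed behaviour (illustrative only): drones are evenly spaced on a
    circle that rotates about the z-axis by `turns` full revolutions while
    the radius grows linearly from r_start to r_end over `duration` seconds.
    """
    # Normalised progress through the primitive, clamped to [0, 1].
    s = min(max(t / duration, 0.0), 1.0)
    radius = r_start + (r_end - r_start) * s
    rotation = 2.0 * math.pi * turns * s

    positions = []
    for i in range(n_drones):
        # Even angular spacing plus the shared rotation of the formation.
        theta = 2.0 * math.pi * i / n_drones + rotation
        positions.append((radius * math.cos(theta),
                          radius * math.sin(theta),
                          height))
    return positions


def min_pairwise_distance(positions):
    """Smallest Euclidean distance between any two drones."""
    return min(math.dist(p, q)
               for i, p in enumerate(positions)
               for q in positions[i + 1:])


if __name__ == "__main__":
    # Sample the primitive at the start, middle, and end of a 10 s segment.
    for t in (0.0, 5.0, 10.0):
        pts = spiral_primitive(n_drones=4, t=t, duration=10.0)
        print(t, round(min_pairwise_distance(pts), 3))
```

Because the radius only grows in this sketch, the minimum separation never shrinks over the segment. A primitive that contracts the formation would need a check against a minimum-distance threshold before deployment.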