Gen-Swarms: Adapting Deep Generative Models to Swarms of Drones

Carlos Plou, Pablo Pueyo, Ruben Martinez-Cantin, Mac Schwager, Ana C. Murillo, Eduardo Montijano

TL;DR

Gen-Swarms automates drone show generation by combining deep generative models with motion-constrained planning. It introduces a conditional flow-matching framework, conditioned on a latent shape representation produced by a PointNet encoder and guided by a trainable affine-coupling bijector, that generates a final 3D point cloud and the corresponding swarm trajectories simultaneously. A reactive navigation layer (ORCA) is integrated into sampling to enforce collision avoidance and smoothness, yielding feasible drone trajectories while preserving high-quality shape reconstructions. Quantitative and qualitative results on ShapeNet Airplane show that Gen-Swarms outperforms diffusion and plain flow-based baselines in collision avoidance, trajectory smoothness, and energy efficiency while maintaining competitive shape fidelity, indicating the approach's practical potential for scalable autonomous aerial shows. The method opens avenues for color-augmented, open-vocabulary, text-driven drone displays; future work includes improving transitions between shapes and incorporating additional sensory attributes.

Abstract

Gen-Swarms is an innovative method that leverages and combines the capabilities of deep generative models with reactive navigation algorithms to automate the creation of drone shows. Advancements in deep generative models, particularly diffusion models, have demonstrated remarkable effectiveness in generating high-quality 2D images. Building on this success, various works have extended diffusion models to 3D point cloud generation. In contrast, alternative generative models such as flow matching have been proposed, offering a simple and intuitive transition from noise to meaningful outputs. However, the application of flow matching models to 3D point cloud generation remains largely unexplored. Gen-Swarms adapts these models to automatically generate drone shows. Existing 3D point cloud generative models create point trajectories which are impractical for drone swarms. In contrast, our method not only generates accurate 3D shapes but also guides the swarm motion, producing smooth trajectories and accounting for potential collisions through a reactive navigation algorithm incorporated into the sampling process. For example, when given a text category like Airplane, Gen-Swarms can rapidly and continuously generate numerous variations of 3D airplane shapes. Our experiments demonstrate that this approach is particularly well-suited for drone shows, providing feasible trajectories, creating representative final shapes, and significantly enhancing the overall performance of drone show generation.
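
The idea of folding a reactive navigation step into the sampling loop can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: `toy_velocity` stands in for the learned, latent-conditioned velocity network (it simply follows a straight line to a known target), `push_apart` is a naive pairwise separation step and not ORCA, and the convention that noise sits at t = 0 and the shape at t = 1 is assumed rather than taken from the paper. The default of 25 Euler steps corresponds to a step size of 1/25 = 0.04.

```python
import numpy as np

def toy_velocity(x, t, target):
    """Hypothetical stand-in for the learned velocity field.

    For a straight-line flow x_t = (1 - t) * x_0 + t * x_1, the velocity
    that carries x_t to x_1 in the remaining time is (x_1 - x_t) / (1 - t).
    """
    return (target - x) / (1.0 - t)

def push_apart(x, min_dist):
    """Naive separation step (NOT ORCA): move each too-close pair apart
    symmetrically along the line joining the two points."""
    x = x.copy()
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            diff = x[i] - x[j]
            dist = np.linalg.norm(diff)
            if 0.0 < dist < min_dist:
                shift = 0.5 * (min_dist - dist) * diff / dist
                x[i] += shift
                x[j] -= shift
    return x

def sample(x0, target, steps=25, min_dist=0.0):
    """Euler integration from x0 towards the target shape.

    After every integration step the positions are optionally corrected
    by the separation step, mimicking a reactive layer inside sampling.
    Returns the full trajectory with shape (steps + 1, n_points, 3).
    """
    dt = 1.0 / steps
    x = x0.copy()
    traj = [x.copy()]
    for k in range(steps):
        t = k * dt
        x = x + dt * toy_velocity(x, t, target)
        if min_dist > 0.0:
            x = push_apart(x, min_dist)
        traj.append(x.copy())
    return np.stack(traj)
```

With `min_dist=0.0` the sketch reduces to plain flow sampling and every point moves along a straight line at constant speed. With a positive `min_dist`, the final positions may deviate from the target wherever target points lie closer together than the separation distance, which is the kind of trade-off between shape fidelity and safety that the abstract describes.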

Paper Structure (27 sections, 13 equations, 7 figures, 2 tables, 2 algorithms)

Figures (7)

  • Figure 1: Illustration of Gen-Swarms, a novel 3D point cloud generative model adapted to handle motion constraints in physical swarms. Unlike the closest existing approach (Diffusion [luo2021diffusion]), Gen-Swarms produces smooth trajectories (zoomed in, middle) that are free of collisions (red points indicate collisions).
  • Figure 2: Illustration of Gen-Swarms. On the left, we show one iteration of the training algorithm. Starting (blue) from the sampled point clouds, we derive intermediate calculations (green) and finish (red) by computing the loss $\mathcal{L}_{\theta, \varphi, \alpha}$. This loss is then used to perform backpropagation through our three neural networks (yellow). On the right, the sampling algorithm generates safe and smooth trajectories from a random initial point cloud towards an accurate 3D shape guided by the shape latent $\mathbf{z}$.
  • Figure 3: Impact of $\Delta t$ on Gen-Swarms reconstruction quality. The final 3D shapes generated by Gen-Swarms for different numbers of steps show convergence at $\Delta t = 0.04\,\mathrm{s}$ (25 steps).
  • Figure 4: Trajectories. Comparison of the drone trajectories generated by Gen-Swarms (top row) versus Diffusion (bottom row). The first three columns show the evolution of each component ($x, y, z$ respectively) of the 3D trajectories, with each drone shown in a different color. The fourth column displays the combined 3D trajectories.
  • Figure 5: Final shapes and collisions. Final 3D point clouds generated by each method. Drones marked in red violate the security threshold, which counts as a collision.
  • ...and 2 more figures
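
The collision marking described in the Figure 5 caption can be sketched as a pairwise distance check. This is a minimal illustration under stated assumptions, not the authors' evaluation code: the function name `colliding_drones`, the Euclidean distance metric, and the threshold value used in the usage lines are all hypothetical.

```python
import numpy as np

def colliding_drones(positions, threshold):
    """Return a boolean mask marking every drone that is closer than
    `threshold` to at least one other drone.

    positions: array of shape (n_drones, 3).
    """
    # Pairwise difference vectors via broadcasting, shape (n, n, 3).
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    # A drone is never in collision with itself.
    np.fill_diagonal(dist, np.inf)
    return (dist < threshold).any(axis=1)

# Usage with an assumed threshold of 0.1 (units are arbitrary here).
positions = np.array([[0.0, 0.0, 0.0],
                      [0.05, 0.0, 0.0],
                      [1.0, 1.0, 1.0]])
mask = colliding_drones(positions, threshold=0.1)
print(mask.sum(), "drones violate the threshold")
```

The dense distance matrix costs memory quadratic in the number of drones; for large swarms a spatial index would likely be preferable, but the dense version keeps the check easy to read.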