Gen-Swarms: Adapting Deep Generative Models to Swarms of Drones
Carlos Plou, Pablo Pueyo, Ruben Martinez-Cantin, Mac Schwager, Ana C. Murillo, Eduardo Montijano
TL;DR
Gen-Swarms automates drone show generation by combining deep generative models with motion-constrained planning. It introduces a conditional flow-matching framework, conditioned on a latent shape representation produced by a PointNet encoder and guided by a trainable affine-coupling bijector, so that a final 3D point cloud and the corresponding swarm trajectories are generated together. A reactive navigation layer (ORCA) is integrated into sampling to enforce collision avoidance and smoothness, yielding feasible drone trajectories while preserving high-quality shape reconstructions. Quantitative and qualitative results on the ShapeNet Airplane category show that Gen-Swarms outperforms diffusion and plain flow-based baselines in collision avoidance, trajectory smoothness, and energy efficiency while maintaining competitive shape fidelity, which suggests the approach is practical for scalable autonomous aerial shows. The method opens avenues for color-augmented, open-vocabulary, text-driven drone displays; future work includes improving transitions between shapes and incorporating additional sensory attributes.
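To make the idea of placing a navigation step inside the sampling loop concrete, the following is a minimal sketch and not the authors' implementation. It assumes straight-line flow-matching paths and replaces the learned, shape-conditioned velocity network with a hypothetical closed-form function (toy_velocity) that heads toward a known target cloud. The collision handling (push_apart) is a simplified pairwise separation used as a placeholder; it is not ORCA. All function names, parameters, and default values are illustrative assumptions.

```python
import numpy as np


def toy_velocity(x, t, target):
    """Hypothetical stand-in for a learned, shape-conditioned velocity network.

    For straight-line paths x_t = (1 - t) * x_0 + t * x_1, the velocity that
    reaches x_1 at t = 1 from the current state is (x_1 - x_t) / (1 - t).
    """
    return (target - x) / max(1.0 - t, 1e-3)


def clip_speed(v, v_max):
    """Rescale each drone's velocity so its norm does not exceed v_max."""
    speed = np.linalg.norm(v, axis=1, keepdims=True)
    return v * np.minimum(1.0, v_max / np.maximum(speed, 1e-9))


def push_apart(x, min_dist, iters=10):
    """Simplified pairwise separation. This is NOT ORCA, only a placeholder."""
    x = x.copy()
    for _ in range(iters):
        diff = x[:, None, :] - x[None, :, :]      # (n, n, 3) pairwise offsets
        dist = np.linalg.norm(diff, axis=-1)      # (n, n) pairwise distances
        np.fill_diagonal(dist, np.inf)            # ignore self-pairs
        close = dist < min_dist
        if not close.any():
            break
        direction = diff / np.maximum(dist, 1e-9)[..., None]
        # Each drone of a too-close pair moves half of the missing distance.
        push = np.where(close, 0.5 * (min_dist - dist), 0.0)
        x += (direction * push[..., None]).sum(axis=1)
    return x


def sample_swarm(target, steps=50, v_max=5.0, min_dist=0.1, seed=0):
    """Euler integration from noise to the target, with a safety step per update."""
    rng = np.random.default_rng(seed)
    x = push_apart(rng.standard_normal(target.shape), min_dist)  # initial positions
    traj = [x.copy()]
    dt = 1.0 / steps
    for k in range(steps):
        v = clip_speed(toy_velocity(x, k * dt, target), v_max)   # motion constraint
        x = push_apart(x + dt * v, min_dist)                     # reactive correction
        traj.append(x.copy())
    return np.stack(traj)                                        # (steps + 1, n, 3)
```

As a usage example, a target of 16 points on a unit ring can be built with np.linspace and np.cos / np.sin and passed to sample_swarm(target); the returned array holds one position per drone per integration step, so the intermediate states double as the swarm trajectories. The separation step is heuristic and does not guarantee a minimum distance, which is one reason a dedicated reactive planner is used in practice.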
Abstract
Gen-Swarms combines deep generative models with reactive navigation algorithms to automate the creation of drone shows. Deep generative models, particularly diffusion models, have proven remarkably effective at generating high-quality 2D images, and several works have extended diffusion models to 3D point cloud generation. Flow matching has been proposed as an alternative generative model that offers a simple and intuitive transition from noise to meaningful outputs, but its application to 3D point cloud generation remains largely unexplored. Gen-Swarms adapts these models to automatically generate drone shows. Existing 3D point cloud generative models produce point trajectories that are impractical for drone swarms. In contrast, our method not only generates accurate 3D shapes but also guides the swarm motion, producing smooth trajectories and accounting for potential collisions through a reactive navigation algorithm incorporated into the sampling process. For example, given a text category such as Airplane, Gen-Swarms can rapidly and continuously generate numerous variations of 3D airplane shapes. Our experiments demonstrate that this approach is well suited to drone shows: it provides feasible trajectories, creates representative final shapes, and significantly improves the overall performance of drone show generation.
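For readers unfamiliar with flow matching, the "simple and intuitive transition from noise to meaningful outputs" can be written down in its most common form. The block below is the standard straight-line conditional flow-matching formulation, given here as background; the exact probability path and conditioning used by Gen-Swarms may differ. The symbols are assumptions for illustration: x_0 is a noise sample, x_1 a data point cloud, z a conditioning code, and v_theta the learned velocity field.

```latex
% Straight-line interpolation between noise x_0 and data x_1
x_t = (1 - t)\, x_0 + t\, x_1, \qquad x_0 \sim \mathcal{N}(0, I), \quad t \in [0, 1]

% Regression objective: match the constant velocity of that path
\mathcal{L}(\theta) = \mathbb{E}_{t,\, x_0,\, x_1}
  \left\| v_\theta(x_t, t, z) - (x_1 - x_0) \right\|^2

% Sampling: integrate the learned field from t = 0 to t = 1
\frac{d x_t}{d t} = v_\theta(x_t, t, z)
```

Because sampling integrates an ordinary differential equation, every intermediate state x_t is a point cloud that moves continuously toward the final shape, which is what makes the intermediate states usable as candidate swarm positions.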
