Building Large-Scale Drone Defenses from Small-Team Strategies

Grant Douglas, Stephen Franklin, Claudia Szabo, Mingyu Guo

TL;DR

This paper tackles scaling defender coordination against adversarial drone swarms by reusing small-team heuristics as modular components within a staged GA–DP framework. By encoding hierarchical policies as chromosomes and applying DP-guided allocations, the approach scales from small to large swarms without exhaustive search. The method integrates hand-crafted and LLM-generated heuristics and uses iterative refinement to converge on robust, high-performing defense strategies that outperform baselines. The work demonstrates practical scalability and provides insights into resource-efficient deployment, though it assumes full observability and deterministic dynamics, suggesting directions for future work in partial observability and learning-based adaptation at lower levels.

Abstract

Defending against large adversarial drone swarms requires coordination methods that scale effectively beyond conventional multi-agent optimisation. In this paper, we propose to scale strategies proven effective in small defender teams by integrating them as modular components of larger forces using our proposed framework. A dynamic programming (DP) decomposition assembles these components into large teams in polynomial time, enabling efficient construction of scalable defenses without exhaustive evaluation. Because a unit that is strong in isolation may not remain strong when combined, we sample across multiple small-team candidates. Our framework iterates between evaluating large-team outcomes and refining the pool of modular components, allowing convergence on increasingly effective strategies. Experiments demonstrate that this partitioning approach scales to substantially larger scenarios while preserving effectiveness and revealing cooperative behaviours that direct optimisation cannot reliably discover.

Paper Structure

This paper contains 40 sections, 1 theorem, 19 equations, 8 figures, 2 tables, 1 algorithm.

Key Result

Proposition 1

Let $R$ and $B$ denote the numbers of red attackers and blue defenders, respectively, and let $k$ denote the maximum size of a red subgroup for which precomputed outcomes are available. Then, for fixed $k$, the DP allocation algorithm (Algorithm 1) has worst-case time complexity polynomial in $R$ and $B$.
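
To make the polynomial bound concrete, the following is a minimal sketch of one way such a DP allocation could be organised. It is not the paper's Algorithm 1: the function `dp_allocate`, the lookup `win(r, b)`, the toy outcome model `toy_win`, and the objective (the product of subgroup win rates, which treats subgroup engagements as independent) are all assumptions introduced here for illustration.

```python
from typing import Callable, List, Tuple


def dp_allocate(
    R: int, B: int, k: int, win: Callable[[int, int], float]
) -> Tuple[float, List[Tuple[int, int]]]:
    """Split R attackers into subgroups of size <= k and assign defenders.

    win(r, b) is a hypothetical lookup of the precomputed win rate of
    b defenders against r attackers (1 <= r <= k, 0 <= b <= B).
    Assumed objective: maximise the product of subgroup win rates.
    """
    # best[r][b]: best score for r attackers using at most b defenders.
    best = [[0.0] * (B + 1) for _ in range(R + 1)]
    choice = [[None] * (B + 1) for _ in range(R + 1)]
    for b in range(B + 1):
        best[0][b] = 1.0  # no attackers left: nothing to defend against

    for r in range(1, R + 1):
        for b in range(B + 1):
            # Peel off one subgroup of sub_r attackers with sub_b defenders.
            for sub_r in range(1, min(k, r) + 1):
                for sub_b in range(b + 1):
                    score = win(sub_r, sub_b) * best[r - sub_r][b - sub_b]
                    if score > best[r][b]:
                        best[r][b] = score
                        choice[r][b] = (sub_r, sub_b)

    # Walk the stored choices back to recover the partition.
    plan: List[Tuple[int, int]] = []
    r, b = R, B
    while r > 0 and choice[r][b] is not None:
        sub_r, sub_b = choice[r][b]
        plan.append((sub_r, sub_b))
        r, b = r - sub_r, b - sub_b
    return best[R][B], plan


def toy_win(r: int, b: int) -> float:
    """Toy outcome model (an assumption, not data from the paper)."""
    return b / (b + r)


if __name__ == "__main__":
    score, plan = dp_allocate(R=8, B=10, k=3, win=toy_win)
    print(score, plan)
```

Under these assumptions the table has $(R+1)(B+1)$ cells and each cell tries at most $k(B+1)$ splits, so the sketch runs in $O(R B^2 k)$ time, which is polynomial in $R$ and $B$ for fixed $k$. This agrees with the statement of Proposition 1, but it is a property of this sketch rather than the paper's own analysis.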

Figures (8)

  • Figure 1: Scenario with 10 defenders (blue) protecting a target (green) from 8 attackers (red). In (a) attackers are shown with their offensive paths (orange) toward the target. In (b) collided drones that no longer participate have muted coloring.
  • Figure 2: Overall hybrid GA–DP framework. The four-stage pipeline (Stage 1: GA evolution, Stage 2: dynamic programming allocation, Stage 3: chromosome sampling, Stage 4: refinement) operates over a hierarchical policy structure assigning heuristics to agents. These heuristics drive all decision-making. The simulator follows a standard MDP cycle, but no reinforcement learning is employed—coordination arises entirely from heuristic assignment and GA–DP optimisation.
  • Figure 3: Heatmaps of win rates for the mean of the top 10% of chromosomes and for the best individual chromosome, before (a) and after (b) applying GA generations. The post-GA chromosomes, which show significantly improved win rates in small-scale swarm-defense scenarios, are used in subsequent stages of the approach.
  • Figure 4: Ablation study of defender performance across Red swarm sizes and Blue-to-Red ratios. Columns show Blue/Red ratios and rows show the number of Red attackers. Shading indicates win rate, with lighter colors representing higher performance. For each subfigure, the top row shows the mean win rate of the top 10% of chromosomes and the bottom row the best-performing chromosome. Stage 3 demonstrates the benefit of combining DP allocation with sampling, while Stage 4 illustrates the additional gains from iterative refinement, achieving the highest and most consistent win rates across all scenarios.
  • Figure 5: Network graphs of heuristic co-occurrence (edge thickness proportional to co-occurrence frequency). GA alone can distil effective structures in small-scale cases (a), but fails to simplify when scaled up (b). Only the full pipeline (c) restores parsimony, converging on a compact set of synergistic heuristics.
  • ...and 3 more figures

Theorems & Definitions (2)

  • Proposition 1
  • Remark 1