Rule-Based Conflict-Free Decision Framework in Swarm Confrontation

Zhaoqi Dong, Zhinan Wang, Quanqi Zheng, Bin Xu, Lei Chen, Jinhu Lv

TL;DR

The paper tackles jitter and deadlock in rule-based swarm decision-making by introducing a probabilistic finite state machine (PFSM) framework that governs transitions with a learned transition matrix $\boldsymbol{P}$ in a 2D adversarial swarm setting governed by double-integrator dynamics. It fuses a multistream deep convolutional network, comprising AgentNet, TeammateNet, and EnemyNet, to produce the PFSM transition probabilities via $\boldsymbol{P}=f_{\Lambda}(\boldsymbol{z})$, and optimizes these transitions with a PPO-based Actor-Critic under a carefully designed reward that penalizes deadlock and jitter. Key contributions include formal PFSM integration with neural architectures, a relational attention mechanism for teammate interactions, and a learning objective that stabilizes state transitions through sparsity and consistency terms in $L^{\text{CLIP}}(\theta)$. Experimental validation in both simulations and real unmanned ground vehicles demonstrates superior rewards and high win rates, with evidence of scalable performance as swarm size grows. Overall, the framework provides interpretable yet adaptive decision-making for robust swarm confrontation, with demonstrated potential for real-world deployment and transfer to larger, more complex multi-agent systems.
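For reference, the two standard building blocks named in this summary can be written out. This is a sketch using textbook forms, not the paper's exact equations: the symbols $\boldsymbol{p}_i$, $\boldsymbol{v}_i$, $\boldsymbol{u}_i$ (position, velocity, and control input of agent $i$), $r_t(\theta)$, $\hat{A}_t$, and $\epsilon$ are assumed notation, and the paper's additional sparsity and consistency terms are not reproduced here.

$$\dot{\boldsymbol{p}}_i = \boldsymbol{v}_i, \qquad \dot{\boldsymbol{v}}_i = \boldsymbol{u}_i$$

$$L^{\text{CLIP}}(\theta) = \hat{\mathbb{E}}_t\left[\min\left(r_t(\theta)\,\hat{A}_t,\ \operatorname{clip}\big(r_t(\theta),\, 1-\epsilon,\, 1+\epsilon\big)\,\hat{A}_t\right)\right], \qquad r_t(\theta) = \frac{\pi_\theta(a_t \mid s_t)}{\pi_{\theta_{\text{old}}}(a_t \mid s_t)}$$

The first line is the usual double-integrator model, in which the control input acts on acceleration. The second is the standard PPO clipped surrogate, which limits how far a single update can move the policy away from the one that collected the data.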

Abstract

Traditional rule-based decision-making methods, such as finite state machines, offer interpretability but suffer from jitter or deadlock (JoD) problems in extremely dynamic scenarios. To realize agent swarm confrontation, the decision conflicts that cause many JoD problems are a key issue to be solved. Here, we propose a novel decision-making framework that integrates a probabilistic finite state machine, deep convolutional networks, and reinforcement learning to give agents interpretable intelligence. Our framework overcomes state machine instability and JoD problems, ensuring reliable and adaptable decisions in swarm confrontation. In rigorous evaluation on real experiments, the proposed approach outperforms other methods and exhibits enhanced human-like cooperation and competitive strategies.

Paper Structure

This paper contains 12 sections, 22 equations, 10 figures, 3 tables.

Figures (10)

  • Figure 1: Two teams of agents, marked in red and blue, battle in an environment with obstacles. The lightly colored circles represent the detection range provided by the detectors. An agent reveals other agents when they enter its detection circle, denoted by the exclamation point. Once an agent is detected by other agents, it may be destroyed by their missiles, as with the agents marked with fire symbols.
  • Figure 2: The model represents the agent and its sensor system within a two-dimensional coordinate framework. $O$ marks the center of the agent, which serves as the origin of the $X$-$Y$ coordinate system. The blue circular area $S_o$ represents the perception range, while $S_a$ denotes the attack range aligned with the velocity vector $\boldsymbol{v}$.
  • Figure 3: Decision-making of a single agent includes three steps: rule-driven decision-making, a composite convolutional network, and a reinforcement learning network. Each step is shown with detailed subcomponents and descriptions, reflecting a structured intelligent decision-making system architecture.
  • Figure 4: The agent state transition model includes five main states: SupportState, SearchState, TrackState, EscapeState, and CooperateState. The arrows in the diagram represent the transition relationships between states. For instance, an arrow from state $i$ to state $j$ indicates the transition from state $i$ to state $j$. All state transitions strictly follow these rules.
  • Figure 5: The network is designed to calculate the transition probability matrix $\boldsymbol{P}$ for multi-agent systems by processing information from agents, teammates, and enemies. Agent information includes speed, position, and state data, which are input into convolutional layers followed by max-pooling, fully connected (FC) layers, and an LSTM layer to capture sequential dependencies. Outputs from these three information streams are concatenated and normalized using a softmax layer. The combined features are then reshaped and further processed through additional convolutional and FC layers to generate the final transition probability matrix $\boldsymbol{P}$, which defines the probabilistic state transitions for the agents.
  • ...and 5 more figures
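
The captions for Figures 4 and 5 describe a five-state machine whose transitions follow fixed rules while their probabilities come from a network. A minimal Python sketch of that idea follows. It is illustrative only: the state names come from the Figure 4 caption, but the `ALLOWED` mask is a hypothetical placeholder (the actual arrows of Figure 4 are not listed here), the random logits stand in for the network output, and `transition_matrix` and `step` are names invented for this example.

```python
import numpy as np

# State names taken from the Figure 4 caption.
STATES = ["SupportState", "SearchState", "TrackState", "EscapeState", "CooperateState"]

# HYPOTHETICAL rule mask: ALLOWED[i, j] is True if the rules permit i -> j.
# Self-loops are always allowed; the other entries are placeholders,
# not the arrows of the paper's Figure 4.
ALLOWED = np.eye(len(STATES), dtype=bool)
for i, j in [(0, 1), (1, 0), (1, 2), (2, 3), (2, 4), (3, 1), (4, 2)]:
    ALLOWED[i, j] = True


def transition_matrix(logits: np.ndarray, allowed: np.ndarray) -> np.ndarray:
    """Row-wise masked softmax: forbidden transitions get probability 0."""
    masked = np.where(allowed, logits, -np.inf)
    # Subtract the row maximum for numerical stability (finite because
    # every row has at least its self-loop allowed).
    masked = masked - masked.max(axis=1, keepdims=True)
    weights = np.exp(masked)  # exp(-inf) evaluates to exactly 0.0
    return weights / weights.sum(axis=1, keepdims=True)


def step(state: int, P: np.ndarray, rng: np.random.Generator) -> int:
    """Sample the next state index from row `state` of P."""
    return int(rng.choice(len(STATES), p=P[state]))


rng = np.random.default_rng(0)
logits = rng.normal(size=(len(STATES), len(STATES)))  # stand-in for network output
P = transition_matrix(logits, ALLOWED)

state = STATES.index("SearchState")
for _ in range(5):
    state = step(state, P, rng)
    print(STATES[state])
```

Masking before the softmax keeps every row of `P` a valid probability distribution while guaranteeing that a sampled transition never violates the rule set, which is one simple way to combine hand-written rules with learned probabilities.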