Less is More: Robust Zero-Communication 3D Pursuit-Evasion via Representational Parsimony

Jialin Ying, Zhihao Li, Zicheng Dong, Guohua Wu, Yihuan Liao

Abstract

Asymmetric 3D pursuit-evasion in cluttered voxel environments is difficult under communication latency, partial observability, and nonholonomic maneuver limits. While many MARL methods rely on richer inter-agent coupling or centralized signals, these dependencies can become sources of fragility when communication is delayed or noisy. Building on an inherited path-guided decentralized pursuit scaffold, we study a robustness-oriented question: can representational parsimony improve communication-free coordination? We instantiate this principle with (i) a parsimonious actor observation interface that removes team-coupled channels (83-D to 50-D), and (ii) Contribution-Gated Credit Assignment (CGCA), a locality-aware credit structure for communication-denied cooperation. In Stage-5 evaluation (4 pursuers vs. 1 evader), our configuration reaches a success rate of 0.753 ± 0.091 and a collision rate of 0.223 ± 0.066, outperforming the 83-D FULL OBS counterpart (0.721 ± 0.071 success, 0.253 ± 0.089 collision). It further shows graceful degradation under speed/yaw/noise/delay stress tests and resilient zero-shot transfer on urban-canyon maps (about 61% success at density 0.24). These results support a practical paradigm shift: explicitly severing redundant cross-agent channels can suppress compounding error cascades and improve robustness in latency-prone deployment.

Paper Structure (23 sections, 15 equations, 9 figures, 3 tables)

Figures (9)

  • Figure 1: Training reward curves under Stage 5. CTDE MAPPO shows marked instability after the visibility-gated transition, whereas OURS-LITE converges to the highest stable return regime.
  • Figure 2: Inherited path-guided decentralized pipeline [pgf_mappo_2025]. Our change is at the actor interface: 83-D full observation is masked to 50-D by removing teammate/slot/encirclement channels for zero-communication robustness.
  • Figure 3: Contribution-Gated Credit Assignment. Direction reward is fully active within 40 m, decays over 40–80 m, and vanishes beyond 80 m; capture-share is hard-gated beyond 60 m. This local credit geometry suppresses free-rider equilibria in zero-communication team pursuit.
  • Figure 4: Velocity sweep. OURS-LITE maintains the strongest success–collision trade-off under increasing evader speed, while EUCLIDEAN suffers persistent high-step inefficient pursuits.
  • Figure 5: Yaw-rate sweep. The proposed configuration preserves a robust mid-regime margin and shows graceful degradation as maneuver authority shrinks.
  • ...and 4 more figures
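
The actor-interface change described in the Figure 2 caption (83-D full observation masked to 50-D) can be illustrated with a minimal sketch. The caption names the removed channel groups (teammate, slot, encirclement) but not their positions in the vector, so the index layout below is purely an assumption: the 33 removed channels are placed at the end of the vector for illustration only, and the names `TEAM_COUPLED_INDICES` and `mask_observation` are hypothetical.

```python
FULL_OBS_DIM = 83   # full actor observation size, from the paper
LITE_OBS_DIM = 50   # masked actor observation size, from the paper

# ASSUMPTION: the real positions of the teammate/slot/encirclement channels
# are not given in the source. Here they are placed in the last 33 slots
# purely for illustration.
TEAM_COUPLED_INDICES = frozenset(range(LITE_OBS_DIM, FULL_OBS_DIM))


def mask_observation(full_obs):
    """Drop the (assumed) team-coupled channels from an 83-D observation."""
    if len(full_obs) != FULL_OBS_DIM:
        raise ValueError(
            f"expected {FULL_OBS_DIM} values, got {len(full_obs)}"
        )
    # Keep every channel whose index is not marked as team-coupled.
    return [v for i, v in enumerate(full_obs) if i not in TEAM_COUPLED_INDICES]


if __name__ == "__main__":
    dummy_obs = [float(i) for i in range(FULL_OBS_DIM)]
    lite_obs = mask_observation(dummy_obs)
    print(len(lite_obs))
```

Under this sketch the mask is a fixed index filter applied before the actor network, so the policy never receives teammate-derived inputs at training or deployment time.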
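
The distance thresholds in the Figure 3 caption can be written as two gate functions. This is a hedged sketch, not the authors' implementation: the caption states only that the direction reward decays between 40 m and 80 m, so the linear decay used here is an assumption, as are the function names `direction_gate` and `capture_share_gate` and the choice to treat exactly 60 m as still inside the capture-share gate.

```python
def direction_gate(distance_m, full_radius=40.0, cutoff_radius=80.0):
    """Weight in [0, 1] applied to a pursuer's direction reward.

    Fully active within 40 m and zero beyond 80 m, per the Figure 3 caption.
    ASSUMPTION: the decay between the two radii is linear; the source does
    not specify its shape.
    """
    if distance_m <= full_radius:
        return 1.0
    if distance_m >= cutoff_radius:
        return 0.0
    return (cutoff_radius - distance_m) / (cutoff_radius - full_radius)


def capture_share_gate(distance_m, gate_radius=60.0):
    """Hard gate: 1.0 if the pursuer is within 60 m of the evader, else 0.0.

    ASSUMPTION: a pursuer at exactly 60 m still counts as inside the gate.
    """
    return 1.0 if distance_m <= gate_radius else 0.0


if __name__ == "__main__":
    for d in (20.0, 50.0, 60.0, 70.0, 100.0):
        print(d, direction_gate(d), capture_share_gate(d))
```

With gates of this form, a pursuer far from the evader earns neither direction reward nor a share of the capture bonus, which is the locality property the caption credits with suppressing free-rider behavior.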