Less is More: Robust Zero-Communication 3D Pursuit-Evasion via Representational Parsimony

Jialin Ying; Zhihao Li; Zicheng Dong; Guohua Wu; Yihuan Liao

Abstract

Asymmetric 3D pursuit-evasion in cluttered voxel environments is difficult under communication latency, partial observability, and nonholonomic maneuver limits. Many MARL methods rely on richer inter-agent coupling or centralized signals, but these dependencies can become sources of fragility when communication is delayed or noisy. Building on an inherited path-guided decentralized pursuit scaffold, we study a robustness-oriented question: can representational parsimony improve communication-free coordination? We instantiate this principle with (i) a parsimonious actor observation interface that removes team-coupled channels, reducing the observation from 83-D to 50-D, and (ii) Contribution-Gated Credit Assignment (CGCA), a locality-aware credit structure for communication-denied cooperation. In Stage-5 evaluation (4 pursuers vs. 1 evader), our configuration reaches 0.753 +/- 0.091 success and 0.223 +/- 0.066 collision, outperforming the 83-D FULL OBS counterpart (0.721 +/- 0.071 success, 0.253 +/- 0.089 collision). It also degrades gracefully under speed, yaw-rate, noise, and delay stress tests, and transfers zero-shot to urban-canyon maps with about 61% success at density 0.24. These results support a practical paradigm shift: explicitly severing redundant cross-agent channels can suppress compounding error cascades and improve robustness in latency-prone deployment.
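The 83-D to 50-D reduction described in the abstract can be illustrated with a minimal sketch. The abstract gives only the two dimensionalities and states that the removed channels are team-coupled, so the index layout below (team-coupled features in the last 33 slots) and the names `TEAM_COUPLED_IDX`, `KEEP_IDX`, and `mask_observation` are hypothetical placeholders, not the paper's actual layout or code.

```python
FULL_DIM = 83   # full actor observation size, from the abstract
LITE_DIM = 50   # parsimonious actor observation size, from the abstract

# ASSUMPTION: the 33 team-coupled channels occupy indices 50..82.
# The real channel layout is not given in this text.
TEAM_COUPLED_IDX = frozenset(range(50, 83))

# Indices that survive the mask, in their original order.
KEEP_IDX = tuple(i for i in range(FULL_DIM) if i not in TEAM_COUPLED_IDX)


def mask_observation(full_obs):
    """Drop the (assumed) team-coupled channels from one 83-D observation."""
    if len(full_obs) != FULL_DIM:
        raise ValueError(f"expected {FULL_DIM} features, got {len(full_obs)}")
    return [full_obs[i] for i in KEEP_IDX]


if __name__ == "__main__":
    lite_obs = mask_observation([0.0] * FULL_DIM)
    print(len(lite_obs))
```

Because this sketch drops the channels rather than zeroing them, the actor's input layer would be sized to 50 features and would never receive teammate-derived values, even corrupted ones.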

Paper Structure (23 sections, 15 equations, 9 figures, 3 tables)

Introduction
Related Work
A. 3D Pursuit-Evasion
B. Communication-Constrained MARL
C. Credit Assignment in Cooperative MARL
Problem Formulation and Environment Setup
Stage-5 environment
Asymmetric game model
Methodology
Inherited path-guided decentralized scaffold
Observation-space formalization: representational parsimony via 83-D to 50-D masking
Reward composition and Contribution-Gated Credit Assignment (CGCA)
Experiments and Evaluations
Setup and baselines
Curriculum maturation and visibility gating
...and 8 more sections

Figures (9)

Figure 1: Training reward curves under Stage 5. CTDE MAPPO shows marked instability after the visibility-gated transition, whereas OURS-LITE converges to the highest stable return regime.
Figure 2: Inherited path-guided decentralized pipeline [pgf_mappo_2025]. Our change is at the actor interface: 83-D full observation is masked to 50-D by removing teammate/slot/encirclement channels for zero-communication robustness.
Figure 3: Contribution-Gated Credit Assignment. Direction reward is fully active within 40 m, decays over 40-80 m, and vanishes beyond 80 m; capture-share is hard-gated beyond 60 m. This local credit geometry suppresses free-rider equilibria in zero-communication team pursuit.
Figure 4: Velocity sweep. OURS-LITE maintains the strongest success-collision trade-off under increasing evader speed, while EUCLIDEAN suffers persistent high-step inefficient pursuits.
Figure 5: Yaw-rate sweep. The proposed configuration preserves a robust mid-regime margin and shows graceful degradation as maneuver authority shrinks.
...and 4 more figures
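The distance gates described in the Figure 3 caption can be sketched as two small functions. The 40 m, 80 m, and 60 m thresholds come from the caption. The linear shape of the 40-80 m decay, the handling of the exact boundary values, and the function names are assumptions made for illustration, since the caption does not specify them.

```python
def direction_gate(distance_m, full_radius_m=40.0, cutoff_radius_m=80.0):
    """Weight on the direction reward as a function of pursuer-evader distance.

    Fully active within 40 m and zero beyond 80 m (from the caption).
    ASSUMPTION: the decay between the two radii is linear.
    """
    if distance_m <= full_radius_m:
        return 1.0
    if distance_m >= cutoff_radius_m:
        return 0.0
    return (cutoff_radius_m - distance_m) / (cutoff_radius_m - full_radius_m)


def capture_share_gate(distance_m, gate_radius_m=60.0):
    """Hard gate on the capture-share term: no share beyond 60 m.

    ASSUMPTION: a pursuer at exactly 60 m still receives its share.
    """
    return 1.0 if distance_m <= gate_radius_m else 0.0


if __name__ == "__main__":
    for d in (20.0, 50.0, 60.0, 70.0, 100.0):
        print(d, direction_gate(d), capture_share_gate(d))
```

Under these assumptions, a pursuer 70 m from the evader keeps a quarter of its direction reward but receives no capture share, so a distant teammate cannot collect credit for a capture it did not contribute to.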
