Prism: Spectral Parameter Sharing for Multi-Agent Reinforcement Learning
Kyungbeom Kim, Seungwon Oh, Kyung-Joong Kim
TL;DR
Prism tackles scalability and policy heterogeneity in multi-agent reinforcement learning by learning in the spectral domain. It factorizes the shared weight matrix as $W = U \,\mathrm{diag}(s)\, V^\top$, where $U$ and $V$ are common to all agents and each agent applies its own spectral mask to $s$, enabling diverse yet compact policies. The approach adds diversity and orthogonality regularization and performs competitively with or better than baselines on homogeneous (LBF, SMACv2) and heterogeneous (MaMuJoCo) tasks while reducing memory overhead. The work highlights that sharing in spectral space balances expressiveness and efficiency, particularly under resource constraints.
Abstract
Parameter sharing is a key strategy in multi-agent reinforcement learning (MARL) for improving scalability, yet conventional fully shared architectures often collapse into homogeneous behaviors. Recent methods introduce diversity through clustering, pruning, or masking, but typically compromise resource efficiency. We propose Prism, a parameter sharing framework that induces inter-agent diversity by representing shared networks in the spectral domain via singular value decomposition (SVD). All agents share the singular vector directions while learning distinct spectral masks on singular values. This mechanism encourages inter-agent diversity and preserves scalability. Extensive experiments on both homogeneous (LBF, SMACv2) and heterogeneous (MaMuJoCo) benchmarks show that Prism achieves competitive performance with superior resource efficiency.
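The sharing scheme described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes each agent's mask is a vector in $[0, 1]$ multiplied element-wise with the shared singular values, and the names `agent_weights`, `masks`, and all dimensions are hypothetical. The diversity and orthogonality regularizers are omitted because the text does not specify their form.

```python
import numpy as np

def agent_weights(U, s, V, masks):
    """Build one weight matrix per agent as W_i = U diag(m_i * s) V^T.

    U: (d_out, r) shared left singular directions
    s: (r,) shared singular values
    V: (d_in, r) shared right singular directions
    masks: (n_agents, r) agent-specific spectral masks (assumed element-wise)
    """
    masked_s = masks * s  # broadcasts to (n_agents, r)
    # Sum over the rank index r for every agent a: result is (n_agents, d_out, d_in)
    return np.einsum("or,ar,ir->aoi", U, masked_s, V)

rng = np.random.default_rng(0)
n_agents, d_out, d_in, rank = 3, 8, 6, 4  # illustrative sizes only

# Shared orthonormal directions, obtained here from a reduced QR factorization
U, _ = np.linalg.qr(rng.standard_normal((d_out, rank)))
V, _ = np.linalg.qr(rng.standard_normal((d_in, rank)))
s = np.abs(rng.standard_normal(rank)) + 0.1  # positive singular values

# Random masks stand in for masks that would be learned per agent
masks = rng.uniform(0.0, 1.0, size=(n_agents, rank))
W = agent_weights(U, s, V, masks)

# Parameter counts: shared factors plus one mask per agent, versus
# an independent dense matrix per agent
shared_params = U.size + V.size + s.size
per_agent_params = rank
full_params = n_agents * d_out * d_in
```

In this sketch each additional agent costs only `rank` extra parameters, while the agents' weight matrices still differ because every mask rescales the shared singular values differently.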
