Learning to Balance Altruism and Self-interest Based on Empathy in Mixed-Motive Games
Fanqi Kong, Yizhe Huang, Song-Chun Zhu, Siyuan Qi, Xue Feng
TL;DR
The paper tackles mixed-motive multi-agent reinforcement learning by introducing LASE, a decentralized algorithm that balances altruism and self-interest through empathy-based gifting. LASE infers its social relationship with each co-player via counterfactual reasoning, using a perspective-taking module to estimate the co-player's action distribution, and applies a zero-sum gifting scheme to modulate reward sharing, guiding policies toward cooperative behavior while mitigating exploitation. The authors provide theoretical analysis in iterated matrix games and validate the approach on the iterated prisoner's dilemma (IPD) and four sequential social dilemmas, showing improved cooperation, fairness, and adaptability to diverse co-players. The work offers a framework for empathy-informed decision-making in decentralized multi-agent systems, with potential applications in autonomous collaboration and negotiation settings.
Abstract
Real-world multi-agent scenarios often involve mixed motives, demanding altruistic agents capable of self-protection against potential exploitation. However, existing approaches often struggle to achieve both objectives. In this paper, building on the observation that empathic responses are modulated by inferred social relationships between agents, we propose LASE (Learning to balance Altruism and Self-interest based on Empathy), a distributed multi-agent reinforcement learning algorithm that fosters altruistic cooperation through gifting while avoiding exploitation by other agents in mixed-motive games. LASE allocates a portion of its rewards to co-players as gifts, and this allocation adapts dynamically to the social relationship, a metric of co-player friendliness estimated by counterfactual reasoning. In particular, the social relationship for each co-player is measured by comparing the estimated $Q$-function of the current joint action to a counterfactual baseline that marginalizes out the co-player's action, where the co-player's action distribution is inferred by a perspective-taking module. Comprehensive experiments in spatially and temporally extended mixed-motive games demonstrate LASE's ability to promote group collaboration without compromising fairness and its capacity to adapt its policy to various types of interactive co-players.
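
The counterfactual comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the tabular form of the inputs, and the way gift fractions are passed in are all assumptions made for illustration. Here `q_values[a]` stands for the focal agent's estimated $Q$-value when the co-player takes action `a` (all other actions held fixed), and `predicted_policy[a]` stands for the action probability that a perspective-taking module would supply.

```python
from typing import List, Sequence, Tuple


def social_relationship(
    q_values: Sequence[float],
    predicted_policy: Sequence[float],
    taken_action: int,
) -> float:
    """Q-value of the observed joint action minus a counterfactual baseline.

    The baseline marginalizes the co-player's action under its inferred
    action distribution. A positive result means the co-player's actual
    action was better for the focal agent than expected.
    """
    if len(q_values) != len(predicted_policy):
        raise ValueError("q_values and predicted_policy must have equal length")
    baseline = sum(p * q for p, q in zip(predicted_policy, q_values))
    return q_values[taken_action] - baseline


def split_reward(
    reward: float, gift_fractions: Sequence[float]
) -> Tuple[float, List[float]]:
    """Zero-sum gifting: whatever is given away is deducted from the giver.

    gift_fractions[j] is the (hypothetical) share of the reward sent to
    co-player j; how these shares are derived from social relationships is
    not specified in the abstract, so they are taken as inputs here.
    """
    if any(f < 0 for f in gift_fractions) or sum(gift_fractions) > 1:
        raise ValueError("gift fractions must be non-negative and sum to at most 1")
    gifts = [reward * f for f in gift_fractions]
    kept = reward - sum(gifts)
    return kept, gifts
```

For example, with `q_values = [1.0, 3.0]` and a uniform predicted policy, the baseline is 2.0, so a co-player that takes action 1 receives a relationship score of 1.0 and one that takes action 0 receives -1.0. Keeping `split_reward` zero-sum means gifting redistributes reward without creating any.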
