Investigating the Impact of Subgraph Social Structure Preference on the Strategic Behavior of Networked Mixed-Motive Learning Agents

Xinqi Gao, Mario Ventresca

Abstract

Limited work has examined the strategic behavior of relational networked learning agents under social dilemmas, and that work has overlooked the intricate social dynamics of complex systems. We address this gap with Socio-Relational Intrinsic Motivation (SRIM), which endows agents with diverse preferences over subgraph social structures, allowing us to study how agents' personal preferences over their subgraph relations affect their strategic decision-making in sequential social dilemmas. Our results in the Harvest and Cleanup environments demonstrate that preferences over different subgraph structures (degree-, clique-, and critical-connection-based) lead to distinct variations in agents' reward gathering and strategic behavior: individual aggressiveness in Harvest and individual contribution effort in Cleanup. Moreover, agents in different subgraph structural positions consistently exhibit similar shifts in strategic behavior. Our proposed BCI metric captures structural variation within the population, and the relative ordering of BCI across social preferences is consistent between Harvest and Cleanup for the same topology, suggesting that the impact of subgraph structure is robust across environments. These results provide a new lens for examining agents' behavior in social dilemmas and offer insight for designing effective multi-agent ecosystems composed of heterogeneous social agents.

Paper Structure

This paper contains 48 sections, 2 equations, 10 figures, 3 tables.

Figures (10)

  • Figure E1: Individual base reward under the baseline model exhibits identical decaying trajectories. Five agents under a network-free, preference-free protocol learn and compete independently in Harvest and Cleanup.
  • Figure E2: Distribution of BCI for various topologies under different agent social preferences in Harvest and Cleanup. The y-axis represents the BCI metric. 'NN': Nearest-Neighbor preferences, 'CN': Clique-Neighbor preferences, 'HBN': Critical-Connection Neighbor preferences. Topologies are shown in each subplot. Agent colors are consistent across all subfigures.
  • Figure E3: Individual mean base reward under different agent preferences across different topologies. Top row: Nearest-Neighbor Preference ($\alpha=1$); Middle row: Clique-Neighbor Preference ($\beta=1$); Bottom row: Critical-Connection Neighbor Preference ($\omega=1$). Non-active parameters are set to 0. Topologies are shown in the top-right (top-left) corner of each subplot. Star and House have 5 agents; A1 has 9 agents. Agent colors are consistent across all subfigures. A2 (intermediate size, $N=7$) shows similar trajectories across preferences; its visualization (and likewise for Fig. 5) is therefore deferred to App. Sec. III for readability.
  • Figure E4: Agents' individual aggressiveness over all experimental agent steps under (a) clique-neighbor preference (CN) in the 5-agent House and 9-agent A1 topologies, and (b) critical-connection preference (HBN) in the 5-agent Star and 9-agent A1 topologies. The upper subfigure shows individual aggressiveness during the random exploration and transient stages, with the x-axis limited to $2\times 10^{7}$ and the original y-scale. The lower subfigure shows individual aggressiveness at convergence, with x-axis range $[0.6\times 10^{8}, 1\times 10^{8}]$ and y-axis range $[0, 0.15]$.
  • Figure E5: Individual good contribution under different agent preferences across different topologies in Cleanup. Top row: Nearest-Neighbor Preference; Middle row: Clique-Neighbor Preference; Bottom row: Critical-Connection Neighbor Preference. Colors are consistent with the Fig. 3 legend.
  • ...and 5 more figures