Adaptive bias for dissensus in nonlinear opinion dynamics with application to evolutionary division of labor games

Tyler M. Paine, Anastasia Bizyaeva, Michael R. Benjamin

TL;DR

The paper addresses adaptive control of a dissensus bias in nonlinear opinion dynamics to optimize collective rewards by partitioning a population into two task groups. It fuses nonlinear opinion dynamics with an evolutionary division of labor game, deriving conditions for steerable allocations via decentralized feedback and for adaptive payoff estimation under persistent excitation. A complete adaptive bias controller is developed, integrating payoff estimation through consensus and bias tuning to drive the population toward Nash equilibrium (NE) behavior. The framework is shown to be scalable and decentralized, with simulations demonstrating convergence toward the NE despite changing network connectivity and unknown payoffs, indicating practical applicability for large swarms and autonomous decision-making tasks.

Abstract

This paper addresses the problem of adaptively controlling the bias parameter in nonlinear opinion dynamics (NOD) to allocate agents into groups of arbitrary sizes for the purpose of maximizing collective rewards. In previous work, an algorithm based on the coupling of NOD with a multi-objective behavior optimization was successfully deployed as part of a multi-robot system in an autonomous task allocation field experiment. Motivated by the field results, in this paper we propose and analyze a new task allocation model that synthesizes NOD with an evolutionary game framework. We prove sufficient conditions under which it is possible to control the opinion state in the group to a desired allocation of agents between two tasks through an adaptive bias using decentralized feedback. We then verify the theoretical results with a simulation study of a collaborative evolutionary division of labor game.

Paper Structure (16 sections, 7 theorems, 25 equations, 5 figures)

Key Result

Proposition 1

Consider the consensus dynamics (\ref{eq:consensus}) and assume that $Q(t)$ is doubly stochastic and compatible with a strongly connected graph $G(\Omega, \mathcal{E})$ for all $t \in \mathbb{N}$. Then $\lim_{t \rightarrow \infty} \chi_i(t) = \frac{1}{N}\sum_{j=1}^N \chi_j(0)$ for all $i = 1, 2, \ldots, N$.
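
As a rough numerical illustration of Proposition 1, the sketch below iterates a linear averaging update $\chi(t+1) = Q\chi(t)$ on a ring of six agents. The update form, the fixed (rather than time-varying) matrix $Q$, the ring topology, the equal weights of $1/3$, and the initial values are all assumptions chosen for illustration; none of them is taken from the paper.

```python
def ring_weights(n):
    """Doubly stochastic weights for a ring: 1/3 on self and on each neighbor.

    Hypothetical choice of Q; any doubly stochastic matrix compatible with a
    strongly connected graph would serve the same purpose.
    """
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            q[i][j % n] += 1.0 / 3.0
    return q


def consensus_step(q, chi):
    """One averaging step chi(t+1) = Q chi(t) (assumed form of the update)."""
    return [sum(q_ij * c for q_ij, c in zip(row, chi)) for row in q]


def run_consensus(chi0, steps=200):
    """Iterate the averaging step with a fixed ring matrix Q."""
    q = ring_weights(len(chi0))
    chi = list(chi0)
    for _ in range(steps):
        chi = consensus_step(q, chi)
    return chi


# Arbitrary initial local values held by the six agents.
chi0 = [4.0, -1.0, 0.5, 7.0, 2.5, -3.0]
final = run_consensus(chi0)
average = sum(chi0) / len(chi0)
print(final, average)
```

Because every row and column of $Q$ sums to one, the sum of the entries of $\chi$ is preserved at each step, which is why the common limit is the initial average rather than some other weighted mean.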

Figures (5)

  • Figure 1: Overview of our approach. As time progresses the multi-robot system uses biased dissensus to allocate robots in an iterative division of labor game. An example application to a seek-and-sample scenario is shown where the locations of patches to be sampled change over time. Robots that chose to explore (pink) cooperatively search for unknown patches (yellow), while sampling robots (blue) cooperatively exploit known patches (green). Communication is limited by the environment and network connections between robots are shown in black.
  • Figure 2: Geometric view of the arguments in Theorem \ref{th:dynam_interlacing}. The position of the equilibrium point $\bm{x}^*$ lies on a semi-circle arc on a plane that passes through the origin and contains the vectors $\bm{v}^*$ and $J^{-1}\bm{1}$. The location of $\bm{x}^*$ is parameterized by $\theta_b$ and therefore $b$.
  • Figure 3: Evolution of the opinion state of 100 networked robots participating in an evolutionary game where each robot plays one of two options depending on the sign of its opinion. Adaptation begins partway through the simulation, and after a period of excitation required to collectively estimate the underlying payoff matrix, the population state approaches the goal state, the Nash equilibrium, as desired. The network is a Watts–Strogatz model with mean degree $K=10$ and $\beta=0.1$. Top: Opinion trajectories of all 100 robots, where the inset detailed view shows a typical epoch with opinions approaching equilibrium during the initial decision-formation period; $d = 0.5$, $\gamma = -0.03$, $\alpha = 0.3$. Middle: After adaptation begins, the evolution of the observed mixed strategy (\ref{eq:y_obs}) in a system with the adaptive bias (\ref{eq:adaptive_bias_update}) follows the estimate of the mixed-strategy Nash equilibrium via replicator dynamics (\ref{eq:ref_dynamics_update}), which converges to the true Nash equilibrium. Bottom: The adaptive bias term during the simulated game iterations.
  • Figure 4: Simulated demonstration of Theorem \ref{th:dynam_interlacing} and Conjecture \ref{conj:nod_interlacing} for the nonlinear system (\ref{eq:nod_vector}). Eight robot nodes in a randomly generated Barabási–Albert network ($m=2$) are subject to a monotonically increasing bias $b$. a) The opinion trajectories cross zero one at a time as the bias increases. b) Stepwise increases in the bias $b$. c) Randomly generated network topology; $d = 0.5$, $\gamma = -0.03$, $\alpha = 0.3$.
  • Figure 5: Evolution of errors in the estimate of the payoff matrix during iterations of the evolutionary game in Figure \ref{fig:evo_game_adapt_bias}. Top: Error in the estimate of each entry tends to zero after adaptation begins. Bottom: Value of the Lyapunov function asymptotically tends to zero after adaptation begins.
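
The Figure 3 caption refers to replicator dynamics converging to a mixed-strategy Nash equilibrium. The paper's own update (\ref{eq:ref_dynamics_update}) is not reproduced in this summary, so the sketch below uses the textbook two-strategy replicator equation $\dot{y} = y(1-y)\,[(A\bm{p})_1 - (A\bm{p})_2]$ with $\bm{p} = (y, 1-y)$, integrated with a forward Euler step. The payoff matrix `A`, the step size `dt`, and the initial share `y0` are hypothetical values chosen so that the game has an interior equilibrium; they are not taken from the paper.

```python
def payoff_gap(a, y):
    """Fitness of strategy 1 minus fitness of strategy 2 at population share y."""
    f1 = a[0][0] * y + a[0][1] * (1.0 - y)
    f2 = a[1][0] * y + a[1][1] * (1.0 - y)
    return f1 - f2


def replicator(a, y0, dt=0.05, steps=2000):
    """Forward-Euler integration of the two-strategy replicator equation."""
    y = y0
    for _ in range(steps):
        y += dt * y * (1.0 - y) * payoff_gap(a, y)
    return y


def mixed_equilibrium(a):
    """Interior rest point where both strategies earn the same payoff."""
    gain_1 = a[0][1] - a[1][1]  # advantage of strategy 1 when everyone plays 2
    gain_2 = a[1][0] - a[0][0]  # advantage of strategy 2 when everyone plays 1
    return gain_1 / (gain_1 + gain_2)


# Hypothetical anti-coordination payoffs: each task pays more when it is rare.
A = [[1.0, 5.0],
     [2.0, 2.0]]
y_star = mixed_equilibrium(A)      # 3 / (3 + 1) = 0.75
y_final = replicator(A, y0=0.1)
print(y_star, y_final)
```

With these payoffs the gap is $3 - 4y$, which is positive below $y = 0.75$ and negative above it, so the share of agents on the first task is pushed toward the interior equilibrium from either side.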

Theorems & Definitions (16)

  • Proposition 1: Garin2010 Theorem 3.2
  • Proposition 2: Cressman2014replicator Theorem 1
  • Proposition 3: Competitive Agents Bizyaeva2023NOD Corollary IV.1.2.B
  • Remark
  • Lemma 1
  • proof
  • Corollary 1
  • proof
  • Theorem 1: Opinion Datum Interlacing of Linearized Bounded System
  • proof
  • ...and 6 more