Optimal Control of Fractional Punishment in Optional Public Goods Game

J. Grau, R. Botta, C. E. Schaerer

TL;DR

This work addresses sustaining cooperation in public goods games by optimizing a fractional punishment scheme through an optimal-control framework. It models the optional public goods game (OPGG) with state dynamics on the simplex and introduces a time-varying punishment fraction v(t) as the control, incorporating a multi-term cost that penalizes final-state error, trajectory error, control effort, and the frequency of sanctions. Using GEKKO/APOPT, the authors demonstrate that the time-varying optimal policy adapts to the prevalence of free-riders, applying stronger sanctions when cooperation is high and reducing sanctions when defection is high, ultimately achieving full cooperation with lower overall cost than fixed punishment strategies. The results highlight the practical value of an adaptive, cost-aware sanctioning framework for promoting cooperation in public-good settings, with implications for designing more efficient institutional punishments that balance effectiveness and resource expenditure.
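
As a rough illustration of the multi-term cost described above, one plausible form of the objective is sketched below. The weights $\alpha_1,\dots,\alpha_4$ are ordered as in the figure captions (final-state error, trajectory error, control effort, sanctioned individuals). The squared Euclidean norms, the target state $w^{*}$, the horizon $T$, and the reading of $y(t)$ as the defector fraction are assumptions made for illustration; the paper's exact functional may differ.

```latex
J(v) = \alpha_1 \,\lVert w(T) - w^{*} \rVert^{2}
     + \int_{0}^{T} \Big( \alpha_2 \,\lVert w(t) - w^{*} \rVert^{2}
     + \alpha_3 \, v(t)^{2}
     + \alpha_4 \, y(t)\, v(t) \Big)\, dt,
\qquad 0 \le v(t) \le 1 .
```

Under this reading, the last term charges for the proportion of punished individuals $y(t)\,v(t)$, so a given punishment fraction costs more when defectors are numerous.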

Abstract

Punishment is probably the most frequently used mechanism to increase cooperation in Public Goods Games (PGG); however, it is expensive. To address this problem, this paper introduces an optimal control problem that uses fractional punishment to promote cooperation. We present a series of computational experiments illustrating the effects of single and combined terms of the optimization cost function. The results show that the optimal controller outperforms constant fractional punishment and give insight into the period and size of the penalization to be implemented with respect to the level of defection in the game.

Paper Structure

This paper contains 16 sections, 10 equations, 11 figures, 1 table, 1 algorithm.

Figures (11)

  • Figure 1: Importance of $\alpha_1$. Part 1. Results obtained by simulating from 0 to 70 time units with 250 steps, from initial state $w_0=[0.2,0.7,0.1]^T$ (marked with a dot), using the controller curve shown in b). a) State space and trajectory of the system. b) Control effort $v$ over time.
  • Figure 2: Importance of $\alpha_1$. Part 2. Result obtained by simulating from 0 to 4 time units with 600 steps, with no controller ($v=0$), starting at $w_0=[0.998,0.001,0.001]^T$. A zoomed-in simplex border is shown in black, with red time stamps along the trajectory. The initial state error is about 0.002449, while the final state error is 0.007008, showing how unnoticeable small time deviations could be at the end of the simulation for this system.
  • Figure 3: Importance of $\alpha_2$. Result obtained by simulating from 0 to 70 time units with 250 steps, considering only $\alpha_2$, with starting state $w_0=[0.2,0.7,0.1]^T$. a) State space and trajectory of the system. b) Control effort $v$ over time. c) Proportion of punished individuals $yv$ over time. The red shade represents the area under the curve and the legend shows its value.
  • Figure 4: Importance of $\alpha_3$ or $\alpha_4$. Result obtained by simulating from 0 to 70 time units with 600 steps, minimizing the control effort or the number of sanctioned individuals, with starting state $w_0=[0.2,0.7,0.1]^T$. a) State space and trajectory of the system. b) Control effort $v$ over time. c) Proportion of punished individuals $yv$ over time. The red shade represents the area under the curve, which in this case is zero.
  • Figure 5: Relative importance of $\alpha_3$ with respect to $\alpha_2$. Part 1. Result obtained by simulating from 0 to 70 time units with 400 steps, with $\alpha_2$ set to 0.999, 0.97, 0.94 and 0.91, and with starting state $w_0=[0.2,0.7,0.1]^T$. a) State space and trajectory of the system. b) Control effort $v$ over time. c) Proportion of punished individuals $yv$ over time. The corresponding shade represents the area under the curve and the legend shows its value.
  • ...and 6 more figures
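
The two quantities reported in the captions above (state error and area under the $yv$ curve) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code. The helper names `state_error` and `area_under_curve` are hypothetical. It also assumes that the target state is the vertex $[1,0,0]^T$, that the error is the Euclidean norm, and that the area is a trapezoidal integral. The first two assumptions are not stated in the captions, although they reproduce the initial state error of about 0.002449 quoted for Figure 2.

```python
import math


def state_error(w, target=(1.0, 0.0, 0.0)):
    """Euclidean distance between a state w and the (assumed) target vertex."""
    return math.sqrt(sum((wi - ti) ** 2 for wi, ti in zip(w, target)))


def area_under_curve(times, values):
    """Trapezoidal-rule integral of sampled values over the given time grid."""
    return sum(
        0.5 * (values[i] + values[i + 1]) * (times[i + 1] - times[i])
        for i in range(len(times) - 1)
    )


# Initial state from the Figure 2 caption.
w0 = [0.998, 0.001, 0.001]
print(round(state_error(w0), 6))  # 0.002449

# With no punishment (v = 0) the punished proportion y*v is identically zero,
# so its area under the curve is zero, as in the Figure 4 caption.
times = [0.0, 35.0, 70.0]
print(area_under_curve(times, [0.0, 0.0, 0.0]))  # 0.0
```

The trapezoidal rule is only one reasonable choice for the shaded area. On a fine uniform grid such as the 250 to 600 steps quoted in the captions, other quadrature rules would give nearly the same value.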

Theorems & Definitions (3)

  • Remark 1
  • Remark 2
  • Remark 3