Trust as Monitoring: Evolutionary Dynamics of User Trust and AI Developer Behaviour

Adeela Bashir, Zhao Song, Ndidi Bianca Ogbo, Nataliya Balabanova, Martin Smit, Chin-wing Leung, Paolo Bova, Manuel Chica Serrano, Dhanushka Dissanayake, Manh Hong Duong, Elias Fernandez Domingos, Nikita Huber-Kralj, Marcus Krellner, Andrew Powell, Stefan Sarkadi, Fernando P. Santos, Zia Ush Shamszaman, Chaimaa Tarzi, Paolo Turrini, Grace Ibukunoluwa Ufeoshi, Victor A. Vargas-Perez, Alessandro Di Stefano, Simon T. Powers, The Anh Han

Abstract

AI safety is an increasingly urgent concern as the capabilities and adoption of AI systems grow. Existing evolutionary models of AI governance have primarily examined incentives for safe development and effective regulation, typically representing users' trust as a one-shot adoption choice rather than as a dynamic, evolving process shaped by repeated interactions. We instead model trust as reduced monitoring in a repeated, asymmetric interaction between users and AI developers, where checking AI behaviour is costly. Using evolutionary game theory, we study how user trust strategies and developer choices between safe (compliant) and unsafe (non-compliant) AI co-evolve under different levels of monitoring cost and institutional regimes. We complement the infinite-population replicator analysis with stochastic finite-population dynamics and reinforcement learning (Q-learning) simulations. Across these approaches, we find three robust long-run regimes: no adoption with unsafe development, unsafe but widely adopted systems, and safe systems that are widely adopted. Only the last is desirable, and it arises when penalties for unsafe behaviour exceed the extra cost of safety and users can still afford to monitor at least occasionally. Our results formally support governance proposals that emphasise transparency, low-cost monitoring, and meaningful sanctions, and they show that neither regulation alone nor blind user trust is sufficient to prevent evolutionary drift towards unsafe or low-adoption outcomes.
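
For orientation, the infinite-population analysis referred to above can be cast as standard two-population (asymmetric) replicator dynamics. In the sketch below, $A$ and $B$ stand in for the user and developer payoff matrices defined later in the paper, so the equations illustrate the method rather than reproduce the model's exact payoffs:

$$
\dot{x}_i = x_i\left[(A\mathbf{y})_i - \mathbf{x}^{\top} A \mathbf{y}\right],
\qquad
\dot{y}_j = y_j\left[(B^{\top}\mathbf{x})_j - \mathbf{y}^{\top} B^{\top} \mathbf{x}\right],
$$

where $\mathbf{x}$ and $\mathbf{y}$ are the frequency vectors of user and developer strategies, respectively.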

Paper Structure

This paper contains 21 sections, 20 equations, 8 figures, 9 tables.

Figures (8)

  • Figure 1: Interaction Sequences between Strategies. Each block represents an action of the user (left stack) and the developer (right stack), which can be cooperate (white) or defect (dark red). Users may also monitor the creator's behaviour, paying a cost (symbols to the right of the stacks of TFT, TUA and DtG). The figure illustrates the differences between the conditional strategies: while TFT always observes, TUA may enter a state of trust after observing the creator cooperate for $\theta_T$ consecutive rounds (in this example $\theta_T = 3$), whereas DtG may enter a state of distrust after observing $\theta_D$ defections. In those states, observation happens only with probability $p_T$ and $p_D$, respectively (a minimal code sketch of these conditional strategies is given after this figure list).
  • Figure 2: Trust-based strategies enhance user adoption, although adoption declines as monitoring becomes more expensive. The first and second columns show the stationary distributions of each state as a function of monitoring cost for scenarios without and with trust-based strategies, respectively. The third column displays the difference in user adoption levels between these two cases across varying monitoring costs. Rows from top to bottom correspond to increasing levels of institutional punishment ($v=0.1$, $0.5$, and $1$). Parameters are set to $b_u=b_c=4$, $\beta=0.1$, $Z_u=Z_c=100$, $c=0.5$, $\mu=-0.2$, $r=10$, $\theta_T=\theta_D=3$, and $p_T=p_D=0.25$.
  • Figure 3: Trust-based strategies enhance user adoption, which further increases with stronger institutional punishment. The first and second columns show the stationary distributions of each state as a function of monitoring cost for scenarios without and with trust-based strategies, respectively. The third column displays the difference in user adoption levels between these two cases across varying monitoring costs. Rows from top to bottom correspond to increasing levels of institutional punishment ($\epsilon=0.1$, $0.5$, and $1$). Parameters are set to $b_u=b_c=4$, $\beta=0.1$, $Z_u=Z_c=100$, $c=0.5$, $\mu=-0.2$, $r=10$, $\theta_T=\theta_D=3$, and $p_T=p_D=0.25$.
  • Figure 4: Numerical modelling of user (top row) and creator (bottom row) cooperation rates for $p_T = 1/4$, $p_D = 1/4$, $\theta_T = 3$, $\theta_D = 3$, $b_{\mathrm{u}} = 4$, $b_{\mathrm{d}} = 4$, $r = 10$, $\mu = -2/10$, $v = 1/10$, $c = 1/2$. The initial condition was an equal distribution of all strategies among users and creators.
  • Figure 5: Percentage of users (top row) adopting different strategies and creator (bottom row) cooperation rates across episodes under Q-learning. The game settings are $p_T = 1/4$, $p_D = 1/4$, $\theta_T = 3$, $\theta_D = 3$, $b_{\mathrm{u}} = 4$, $b_{\mathrm{d}} = 4$, $r = 10$, $\mu = -2/10$, $v = 1/10$, $c = 1/2$. For Q-learning, $\alpha=0.05$ and $\epsilon_L=0.05$ (a minimal sketch of the corresponding update rule is also given after this figure list).
  • ...and 3 more figures
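
To make the conditional strategies in Figure 1 concrete, the sketch below (in Python) implements one plausible reading of the caption: TFT always monitors; TUA switches to monitoring with probability $p_T$ once it has observed $\theta_T$ consecutive cooperations; DtG switches to monitoring with probability $p_D$ once it has observed $\theta_D$ defections. Class and method names are illustrative, and details the caption leaves open are flagged as assumptions in the comments; this is not the authors' implementation.

import random

class ConditionalUser:
    """Illustrative sketch (not the authors' code) of the conditional
    monitoring strategies described in the Figure 1 caption."""

    def __init__(self, kind, theta_T=3, theta_D=3, p_T=0.25, p_D=0.25):
        assert kind in ("TFT", "TUA", "DtG")
        self.kind = kind
        self.theta_T = theta_T    # consecutive cooperations before TUA enters the trust state
        self.theta_D = theta_D    # observed defections before DtG enters the distrust state
        self.p_T = p_T            # monitoring probability in the trust state
        self.p_D = p_D            # monitoring probability in the distrust state
        self.coop_streak = 0
        self.defections_seen = 0
        self.trusting = False
        self.distrusting = False

    def monitors_this_round(self):
        # TFT always pays the monitoring cost; TUA/DtG monitor with reduced
        # probability once they have entered their trust/distrust state.
        if self.kind == "TUA" and self.trusting:
            return random.random() < self.p_T
        if self.kind == "DtG" and self.distrusting:
            return random.random() < self.p_D
        return True

    def observe(self, developer_cooperated):
        # Update counters and state after a round the user actually monitored.
        # Whether the trust/distrust states can later be exited is left open
        # in the caption, so this sketch treats them as absorbing (an assumption).
        if developer_cooperated:
            self.coop_streak += 1
        else:
            self.coop_streak = 0
            self.defections_seen += 1
        if self.kind == "TUA" and self.coop_streak >= self.theta_T:
            self.trusting = True
        if self.kind == "DtG" and self.defections_seen >= self.theta_D:
            self.distrusting = True

One natural way to use this in a round-based simulation is to charge the monitoring cost, and call observe(), only on rounds where monitors_this_round() returns True, with unmonitored rounds proceeding on the user's current (dis)trust assumption.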
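
Similarly, for the Q-learning parameters quoted in the Figure 5 caption, the fragment below shows the standard tabular update and epsilon-greedy action choice that $\alpha$ (learning rate) and $\epsilon_L$ (exploration rate) refer to. The state and action encoding, the discount factor, and the reward signal are placeholders, not the authors' implementation.

import random
from collections import defaultdict

ACTIONS = ["comply", "defect"]                      # illustrative action labels
Q = defaultdict(lambda: {a: 0.0 for a in ACTIONS})  # Q-table: state -> action values

def choose_action(state, eps_L=0.05):
    """Epsilon-greedy choice: explore with probability eps_L, else exploit."""
    if random.random() < eps_L:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[state][a])

def q_update(state, action, reward, next_state, alpha=0.05, gamma=0.95):
    """Tabular Q-learning update with learning rate alpha; gamma is an
    assumed discount factor, not reported in this excerpt."""
    best_next = max(Q[next_state].values())
    Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])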