From homeostasis to resource sharing: Biologically and economically aligned multi-objective multi-agent gridworld-based AI safety benchmarks

Roland Pihlakas

TL;DR

This work addresses a gap in AI safety benchmarks by introducing biologically and economically grounded, multi-objective, multi-agent gridworld benchmarks. It presents a three-stage suite that covers homeostasis, boundedness, diminishing returns, sustainability, and resource sharing, together with cooperation scoring, to probe tradeoffs between safety and performance. The authors implement nine environments within an extensible gridworld framework compatible with major RL and planning tools, and provide baseline results for random, rule-based, SB3, and LLM agents. The study shows how multi-objective and cooperative dynamics reveal risks and behaviors that single-objective benchmarks do not capture, offering a more robust platform for evaluating alignment in complex, realistic settings.

Abstract

Developing safe, aligned agentic AI systems requires comprehensive empirical testing, yet many existing benchmarks neglect crucial themes from biology and economics, both time-tested fundamental sciences that describe our needs and preferences. To address this gap, the present work introduces biologically and economically motivated themes that have been neglected in current mainstream discussions on AI safety, in the form of a set of multi-objective, multi-agent alignment benchmarks that emphasize homeostasis for bounded and biological objectives, diminishing returns for unbounded, instrumental, and business objectives, the sustainability principle, and resource sharing. Eight main benchmark environments have been implemented on these themes to illustrate key pitfalls and challenges in agentic AI systems, such as unboundedly maximizing a homeostatic objective, over-optimizing one objective at the expense of others, neglecting safety constraints, or depleting shared resources.

Paper Structure (30 sections, 10 figures, 3 tables)

Introduction
Summary of major themes in benchmarks below
Other new features of the extended Gridworlds framework
About the choice of benchmark building framework
Benchmarks
Stage 1 (basic biologically inspired dynamics in objectives)
A single positive objective
Safe exploration / quickly learning safety aspects of a novel environment
Bounded objectives, including homeostatic objectives
Sustainability challenge
Stage 2 (multi-objective agents)
Multi-objective environments, combining safety and performance
Balancing multiple unbounded performance objectives
Stage 3 (cooperation)
Cooperative behavior
...and 15 more sections

Figures (10)

Figure 1: Elements and metrics can be configured flexibly for each given benchmark. Examples of configuration options are: observation and state space of agents, scoring dimensions, adding NPC agents, object types and their dynamics.
Figure 2: Screenshot of "Food Unbounded" environment
Figure 3: Screenshot of "Danger Tiles" environment
Figure 4: Screenshot of "Predators" environment
Figure 5: Screenshot of "Food Homeostasis" environment
...and 5 more figures
