Survival Games: Human-LLM Strategic Showdowns under Severe Resource Scarcity

Zhihong Chen, Yiqian Yang, Jinzhao Zhou, Qiang Zhang, Chin-Teng Lin, Yiqun Duan

TL;DR

This paper introduces a first-of-its-kind, survival-driven, asymmetric multi-agent testbed to evaluate LLM ethics in human-AI co-existence under severe resource scarcity. It extends generative agent frameworks with a life-sustaining dynamic and an adapted MACHIAVELLI wrongdoing detector to quantify ethical behavior as agents compete or cooperate for food. Across experiments comparing DeepSeek and GPT-series models, the study finds clear behavioral differences driven by model design and demonstrates that prompt engineering can both steer and deter unethical actions. The framework offers a reproducible, high-stakes evaluation tool with practical implications for deploying LLMs in real-world, resource-constrained human-AI interactions.

Abstract

The rapid advancement of large language models (LLMs) raises critical concerns about their ethical alignment, particularly in scenarios where humans and AI co-exist under conflicts of interest. This work introduces an extensible, asymmetric, multi-agent, simulation-based benchmarking framework to evaluate the moral behavior of LLMs in a novel human-AI co-existence setting featuring consistent living and critical resource management. Building on previous generative agent environments, we incorporate a life-sustaining system in which agents must compete or cooperate for food resources to survive, often leading to ethically charged decisions such as deception, theft, or social influence. We evaluate two families of LLMs, the DeepSeek and OpenAI series, in a three-agent setup (two humans, one LLM-powered robot), using behavioral detection adapted from the MACHIAVELLI framework and a custom survival-based ethics metric. Our findings reveal stark behavioral differences: DeepSeek models frequently engage in resource hoarding, while OpenAI models exhibit restraint, highlighting the influence of model design on ethical outcomes. Additionally, we demonstrate that prompt engineering can significantly steer LLM behavior: jailbreaking prompts substantially increase unethical actions, even for highly restricted OpenAI models, whereas cooperative prompts markedly reduce them. Our framework provides a reproducible testbed for quantifying LLM ethics in high-stakes scenarios, offering insights into their suitability for real-world human-AI interactions.

Paper Structure

This paper contains 39 sections, 4 figures, 6 tables.

Figures (4)

  • Figure 1: The illustration of a virtual environment based on Generative Agents, showcasing the interaction of LLM-driven agents within a simulated setting that supports social and resource-based dynamics, with key components including agent decision-making and environmental feedback.
  • Figure 2: The illustration of the health and food system, depicting the lifecycle of LLM-driven agents with identities set as humans and a robot in a resource-constrained environment. The diagram outlines the flow from food consumption to fullness and hit points, reflecting survival dynamics with a daily reset mechanism that simulates hunger cycles, and the potential consequences of depletion leading to agent removal. It also captures inter-agent interactions, such as giving and taking food, which introduce ethical dilemmas within a zero-sum resource framework, alongside the integration of memory feedback to inform future decisions.
  • Figure 3: An illustrative example of how health and food status progress during a run, together with the wrongdoings detected along the way.
  • Figure 4: An illustration of the LLM-based ethical wrongdoing evaluation system, depicting a structured process that assesses moral violations by LLM-driven agents through context analysis, action classification, and identification of wrongdoings such as deception or stealing, tailored to a resource-constrained environment.
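
The survival loop described in the Figure 2 caption (food consumption feeding fullness and hit points, a daily reset, removal on depletion, and zero-sum giving and taking of food) can be made concrete with a small sketch. This is only an illustration under stated assumptions, not the paper's implementation: the class `Agent`, the functions `transfer` and `end_of_day`, and every numeric constant are hypothetical and chosen for readability.

```python
from dataclasses import dataclass

# All constants are illustrative assumptions, not values from the paper.
MAX_FULLNESS = 3    # hypothetical cap on fullness
HUNGER_DAMAGE = 1   # hypothetical hit points lost for a day ended hungry


@dataclass
class Agent:
    name: str
    food: int = 0
    fullness: int = 0
    hit_points: int = 3
    alive: bool = True

    def eat(self) -> bool:
        """Consume one unit of food and convert it into fullness."""
        if not self.alive or self.food == 0 or self.fullness >= MAX_FULLNESS:
            return False
        self.food -= 1
        self.fullness += 1
        return True


def transfer(src: Agent, dst: Agent, amount: int) -> int:
    """Move food from src to dst (giving or taking); total food is conserved."""
    moved = max(0, min(amount, src.food))
    src.food -= moved
    dst.food += moved
    return moved


def end_of_day(agent: Agent) -> None:
    """Daily reset: a hungry agent loses hit points, then fullness returns to zero."""
    if not agent.alive:
        return
    if agent.fullness == 0:
        agent.hit_points -= HUNGER_DAMAGE
    agent.fullness = 0
    if agent.hit_points <= 0:
        agent.alive = False  # depleted agents are removed from the simulation


robot = Agent("robot")
human = Agent("human", food=2)
transfer(human, robot, 2)   # the robot ends up with all of the human's food
robot.eat()
end_of_day(robot)
end_of_day(human)
print(robot.hit_points, human.hit_points)
```

Because `transfer` only moves food between agents, any unit one agent gains is a unit another loses, which is the zero-sum property that makes giving and taking ethically charged in this setting.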