Terra Nova: A Comprehensive Challenge Environment for Intelligent Agents

Trevor McInroe

TL;DR

Terra Nova is a Civ-inspired comprehensive challenge environment (CCE) for reinforcement learning that requires integrating partial observability, long-horizon credit assignment, representation learning, and an enormous action space within a single, interacting system. The paper formalizes Terra Nova as a turn-based partially observable stochastic game (POSG) with a high-dimensional observation space and an action space of size $\sim 10^{745}$, featuring four mutually exclusive victory conditions to encourage varied strategic trade-offs. It also describes a rich software stack (procedurally generated maps, distributed training capabilities, recording/viewing tools, and a starter neural network architecture) to accelerate research and evaluation. Overall, Terra Nova provides a stringent benchmark for assessing general intelligence in RL and for highlighting limitations of current approaches in integrated, multi-challenge settings.

Abstract

We introduce Terra Nova, a new comprehensive challenge environment (CCE) for reinforcement learning (RL) research inspired by Civilization V. A CCE is a single environment in which multiple canonical RL challenges (e.g., partial observability, credit assignment, representation learning, and enormous action spaces) arise simultaneously. Mastery therefore demands integrated, long-horizon understanding across many interacting variables. We emphasize that this definition excludes challenges that only aggregate unrelated tasks in independent, parallel streams (e.g., learning to play all Atari games at once). These aggregated multitask benchmarks primarily assess whether an agent can catalog and switch among unrelated policies rather than test an agent's ability to perform deep reasoning across many interacting challenges.

Paper Structure

This paper contains 7 sections, 5 figures, 1 table.

Figures (5)

  • Figure 1: An example procedurally generated Terra Nova map. The map is a central landmass surrounded by ocean and made of hexagonal tiles. The landmass is filled with various terrain types (e.g., desert, plains, grassland), features (e.g., oases, flood plains, jungles), elevation (e.g., flatland, hills, mountains), resources, water features, natural wonders, and more. For more information on maps, see the documentation: https://trevormcinroe.github.io/terra_nova_environment#maps-mech
  • Figure 2: The beginning of the technology tree in Terra Nova. Specific technologies are represented with the rectangular emblems containing the technology name and science cost of unlocking. Prerequisite relationships are shown with the connecting gray lines. For example, to begin research on Engineering (towards the bottom of the 2nd column in the "Classical" Era), agents must first unlock Archery, Animal Husbandry, The Wheel, Mathematics, Mining, Masonry, and Construction.
  • Figure 3: An example initial observation for an agent. In Terra Nova, agents begin the game with one Settler (represented with the triangle emblem and flag icon at the bottom of the center hex) and one Warrior (represented with the circle emblem and axe icon at the top of the center hex). This agent was initially spawned on the coast with a resource-rich section of the sea directly to its west and a large plains area to its east. On this initial turn, the agent must decide where to settle its capital city, weighing either settling in place on the coast or revealing more of the map ("unknown" areas shaded in black) by moving its units.
  • Figure 4: An example demographics screen in the Terra Nova Viewer. Displayed here is the total population in each agent's empire plotted over 296 game turns. Users can view many other statistics using the dropdown menu on the top-left of the Demographics screen. Additionally, the "Environment Overview" bar on the top-left of the Viewer contains buttons for many other information screens, and the map is zoomable and draggable, giving the user a complete view of the game.
  • Figure :
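
The prerequisite relationships described in the Figure 2 caption can be sketched as a small dependency graph. The snippet below is only an illustration: the direct edges in `PREREQS` and the helper names `all_prerequisites` and `can_research` are assumptions made for this example and are not taken from the Terra Nova codebase. Only the full set of technologies required before Engineering comes from the caption.

```python
from typing import Dict, Iterable, Set

# Hypothetical direct-prerequisite edges, chosen so that the full set of
# requirements for Engineering matches the Figure 2 caption. The actual
# edges in Terra Nova's technology tree may differ.
PREREQS: Dict[str, Set[str]] = {
    "Archery": set(),
    "Animal Husbandry": set(),
    "Mining": set(),
    "The Wheel": {"Animal Husbandry"},
    "Masonry": {"Mining"},
    "Mathematics": {"Archery", "The Wheel"},
    "Construction": {"Masonry"},
    "Engineering": {"Mathematics", "Construction"},
}


def all_prerequisites(tech: str, prereqs: Dict[str, Set[str]] = PREREQS) -> Set[str]:
    """Return every technology that must be unlocked before `tech`."""
    required: Set[str] = set()
    stack = list(prereqs[tech])
    while stack:
        current = stack.pop()
        if current not in required:
            required.add(current)
            # Follow the chain back through earlier technologies.
            stack.extend(prereqs[current])
    return required


def can_research(
    tech: str, unlocked: Iterable[str], prereqs: Dict[str, Set[str]] = PREREQS
) -> bool:
    """True if every direct prerequisite of `tech` is already unlocked."""
    return prereqs[tech] <= set(unlocked)


if __name__ == "__main__":
    print(sorted(all_prerequisites("Engineering")))
    print(can_research("Engineering", {"Archery", "Mining"}))
```

Under these assumed edges, the transitive set for Engineering contains exactly the seven technologies listed in the caption, which shows why even an early technology can represent a multi-step research plan for an agent.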