Terra Nova: A Comprehensive Challenge Environment for Intelligent Agents
Trevor McInroe
TL;DR
Terra Nova is a Civ-inspired comprehensive challenge environment (CCE) for reinforcement learning that requires agents to handle partial observability, long-horizon credit assignment, representation learning, and an enormous action space within a single, interacting system. The paper formalizes Terra Nova as a turn-based partially observable stochastic game (POSG) with a high-dimensional observation space and an action space of size $\sim 10^{745}$, and the environment features four mutually exclusive victory conditions to encourage varied strategic trade-offs. The paper also describes a rich software stack (procedurally generated maps, distributed training capabilities, recording/viewing tools, and a starter neural network architecture) to accelerate research and evaluation. Overall, Terra Nova provides a stringent benchmark for assessing general intelligence in RL and for highlighting limitations of current approaches in integrated, multi-challenge settings.
Abstract
We introduce Terra Nova, a new comprehensive challenge environment (CCE) for reinforcement learning (RL) research inspired by Civilization V. A CCE is a single environment in which multiple canonical RL challenges (e.g., partial observability, credit assignment, representation learning, and enormous action spaces) arise simultaneously. Mastery therefore demands integrated, long-horizon understanding across many interacting variables. We emphasize that this definition excludes challenges that only aggregate unrelated tasks in independent, parallel streams (e.g., learning to play all Atari games at once). These aggregated multitask benchmarks primarily assess whether an agent can catalog and switch among unrelated policies rather than test an agent's ability to perform deep reasoning across many interacting challenges.
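To give a sense of scale for the action-space size quoted in the TL;DR ($\sim 10^{745}$), the following minimal sketch treats that order of magnitude as an exact count and asks two questions: how many bits a flat integer index over all joint actions would need, and how many independent $k$-way sub-decisions would multiply out to at least that many combinations. The factorization into independent sub-decisions is purely a hypothetical illustration, as is the helper name `sub_decisions_needed`; neither describes how Terra Nova actually structures its actions.

```python
# Order of magnitude quoted in the TL;DR, treated as exact for illustration only.
ACTION_SPACE_EXPONENT = 745
num_joint_actions = 10 ** ACTION_SPACE_EXPONENT

# Bits needed to index every joint action with a flat integer ID in [0, N - 1].
bits_per_action = (num_joint_actions - 1).bit_length()


def sub_decisions_needed(k: int, target: int = num_joint_actions) -> int:
    """Smallest n such that k**n >= target.

    Hypothetical model: the joint action is a product of n independent
    k-way choices. Uses exact integer arithmetic to avoid float rounding.
    """
    if k < 2:
        raise ValueError("each sub-decision needs at least two options")
    n, combos = 0, 1
    while combos < target:
        combos *= k
        n += 1
    return n


if __name__ == "__main__":
    print("bits per flat action index:", bits_per_action)
    for k in (2, 10, 100):
        print(f"{k}-way sub-decisions needed:", sub_decisions_needed(k))
```

Under this toy model, even a few hundred modest sub-decisions per turn are enough to reach the quoted magnitude, which is why enumerating or flatly indexing joint actions is not a practical option at this scale.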
