WOFOSTGym: A Crop Simulator for Learning Annual and Perennial Crop Management Strategies

William Solow, Sandhya Saisubramanian, Alan Fern

TL;DR

This work introduces WOFOSTGym, a high-fidelity RL environment for annual and perennial crop management built on the WOFOST crop growth model (CGM). It supports single- and multi-farm, multi-year experimentation with 23 annual crops and 2 perennial crops. It provides 54 configurable Gym environments, domain randomization, and a Bayesian calibration workflow to improve simulator fidelity and support sim-to-real transfer, and it targets RL challenges that are central to agriculture, such as delayed rewards and partial observability. Using PPO, SAC, DQN, and imitation-learning baselines, the paper shows both the potential and the current limitations of off-the-shelf RL/IL methods in achieving high yields under realistic constraints, and positions the platform as a testbed for developing new algorithms. The work also benchmarks run times against other crop simulators and outlines future extensions to support crop rotations and faster sim-to-real transfer, underscoring WOFOSTGym's practical relevance to agricultural decision support and RL research.

Abstract

We introduce WOFOSTGym, a novel crop simulation environment designed to train reinforcement learning (RL) agents to optimize agromanagement decisions for annual and perennial crops in single- and multi-farm settings. Effective crop management requires optimizing yield and economic returns while minimizing environmental impact, a complex sequential decision-making problem well suited to RL. However, the lack of simulators for perennial crops in multi-farm contexts has hindered RL applications in this domain. Existing crop simulators also do not support multiple annual crops. WOFOSTGym addresses these gaps by supporting 23 annual crops and two perennial crops, enabling RL agents to learn diverse agromanagement strategies in multi-year, multi-crop, and multi-farm settings. Our simulator offers a suite of challenging tasks for learning under partial observability, non-Markovian dynamics, and delayed feedback. WOFOSTGym's standard RL interface allows researchers without agricultural expertise to explore a wide range of agromanagement problems. Our experiments demonstrate the learned behaviors across various crop varieties and soil types, highlighting WOFOSTGym's potential for advancing RL-driven decision support in agriculture.

Paper Structure

This paper contains 33 sections, 7 figures, and 5 tables.

Figures (7)

  • Figure 1: The structure and visualization of the WOFOSTGym simulator. WOFOSTGym provides an API around the WOFOST Crop Growth Model with a variety of environments to train RL agents and generate data. Well-documented configuration files control crop and soil dynamics.
  • Figure 2: Unconstrained Control. The average reward, measured as seasonal yield, of different policies. The BiWeekly NW policy alternates between applying nitrogen and water biweekly, while the Wheat Potential is the maximum growth obtainable. We omit the Jujube Potential because it assumes daily intervention, while we only allow biweekly intervention.
  • Figure 3: Constrained Control. (Left) The running average of the reward during training. (Right) The likelihood of a fertilization or irrigation action each week. Likelihoods were computed over 30 episodes, with darker colors signifying more likely nutrient application.
  • Figure 4: Constrained Control Under Partial Observability. The average reward of PPO agents during training and the average days of runoff after completing training over 15 seasons.
  • Figure 5: (Left) The soil moisture content of each field under three joint RL agromanagement policies. (Right) The average yield obtained by trained multi-field agents. Lighter colors indicate the yield obtained by an agent trained on that specific field as a baseline for obtainable crop yield.
  • ...and 2 more figures