Deep reinforcement learning for irrigation scheduling using high-dimensional sensor feedback

Yuji Saikai, Allan Peake, Karine Chenu

TL;DR

This work formulates irrigation scheduling under high-dimensional sensor feedback as a sequential decision problem and proposes a principled deep reinforcement learning framework for solving it. It uses a policy-gradient method (REINFORCE) to learn a stochastic irrigation policy $\pi_\theta$ that maximizes the expected cumulative reward $\mathbb{E}\left[\sum_{t=0}^{T-1} R(S_t,A_t) + R^+\right]$. The method is demonstrated in a case study with APSIM-Wheat at Goondiwindi, Australia, where a nine-variable state and a five-action set are handled by a neural network with $1{,}448{,}005$ parameters trained over $20{,}000$ episodes. The resulting policy outperforms a conventional replenishment benchmark in every testing year, with the largest profit improvement (17%) in 2018, and reaches 96–98.8% of the year-specific profit potential. The framework is presented as general and adaptable to a wide range of cropping systems and management goals, with emphasis on constructing high-fidelity learning environments and on future extensions such as multi-crop optimization, water budgets, and integration of forecasts.
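For readers unfamiliar with REINFORCE, the objective above leads to the standard score-function gradient shown below. This is the textbook form, given as a sketch; the symbols $J$, $G$ and the step size $\alpha$ are notation assumed here, and the paper's exact variant (for example, any baseline or normalisation) is not reproduced.

$$
J(\theta) = \mathbb{E}_{\pi_\theta}\left[G\right], \qquad G = \sum_{t=0}^{T-1} R(S_t,A_t) + R^+
$$

$$
\nabla_\theta J(\theta) = \mathbb{E}_{\pi_\theta}\left[ G \sum_{t=0}^{T-1} \nabla_\theta \log \pi_\theta(A_t \mid S_t) \right], \qquad \theta \leftarrow \theta + \alpha \, G \sum_{t=0}^{T-1} \nabla_\theta \log \pi_\theta(A_t \mid S_t)
$$

In words: after each simulated season, the log-probabilities of the actions actually taken are pushed up in proportion to the season's total reward $G$.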

Abstract

Deep reinforcement learning has considerable potential to improve irrigation scheduling in many cropping systems by applying adaptive amounts of water based on various measurements over time. The goal is to discover an intelligent decision rule that processes information available to growers and prescribes sensible irrigation amounts for the time steps considered. Due to the technical novelty, however, the research on the technique remains sparse and impractical. To accelerate the progress, the paper proposes a principled framework and actionable procedure that allow researchers to formulate their own optimisation problems and implement solution algorithms based on deep reinforcement learning. The effectiveness of the framework was demonstrated using a case study of irrigated wheat grown in a productive region of Australia where profits were maximised. Specifically, the decision rule takes nine state variable inputs: crop phenological stage, leaf area index, extractable soil water for each of the five top layers, cumulative rainfall and cumulative irrigation. It returns a probabilistic prescription over five candidate irrigation amounts (0, 10, 20, 30 and 40 mm) every day. The production system was simulated at Goondiwindi using the APSIM-Wheat crop model. After training in the learning environment using 1981-2010 weather data, the learned decision rule was tested individually for each year of 2011-2020. The results were compared against the benchmark profits obtained by a conventional rule common in the region. The discovered decision rule prescribed daily irrigation amounts that uniformly improved on the conventional rule for all the testing years, and the largest improvement reached 17% in 2018. The framework is general and applicable to a wide range of cropping systems with realistic optimisation problems.
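The decision rule described above (nine inputs in, a probability for each of five irrigation amounts out, updated by a policy gradient) can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it swaps the deep network for a linear-softmax policy to stay short, the function names and the learning rate are hypothetical, and the return `total_return` would come from the APSIM-based learning environment, which is not included here.

```python
import math
import random

ACTIONS_MM = [0, 10, 20, 30, 40]  # candidate daily amounts listed in the abstract
N_STATE = 9                        # nine state variables listed in the abstract


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def action_probs(theta, state):
    """Linear-softmax policy (assumption): each row is nine weights plus a bias."""
    logits = [row[-1] + sum(w * x for w, x in zip(row, state)) for row in theta]
    return softmax(logits)


def sample_action(theta, state, rng):
    """Draw an action index from the prescribed probabilities."""
    probs = action_probs(theta, state)
    return rng.choices(range(len(ACTIONS_MM)), weights=probs, k=1)[0]


def reinforce_update(theta, episode, total_return, lr):
    """Apply theta += lr * G * sum_t grad log pi(a_t | s_t), in place."""
    grad = [[0.0] * (N_STATE + 1) for _ in ACTIONS_MM]
    for state, action in episode:
        probs = action_probs(theta, state)
        for k in range(len(ACTIONS_MM)):
            # d log pi(a|s) / d logit_k = 1[k == a] - pi(k|s)
            score = (1.0 if k == action else 0.0) - probs[k]
            for j, x in enumerate(state):
                grad[k][j] += score * x
            grad[k][-1] += score  # bias term
    for k, row in enumerate(theta):
        for j in range(len(row)):
            row[j] += lr * total_return * grad[k][j]


if __name__ == "__main__":
    rng = random.Random(0)
    theta = [[0.0] * (N_STATE + 1) for _ in ACTIONS_MM]
    state = [0.5] * N_STATE  # made-up, pre-scaled sensor readings
    action = sample_action(theta, state, rng)
    reinforce_update(theta, [(state, action)], total_return=1.0, lr=0.1)
    print(ACTIONS_MM[action], action_probs(theta, state))
```

The gradient is accumulated over the whole episode before the parameters change, so every log-probability is differentiated at the same $\theta$. A positive return raises the probability of the amounts that were applied; a negative return lowers it.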

Paper Structure

This paper contains 15 sections, 9 equations, 3 figures, 3 tables, 1 algorithm.

Figures (3)

  • Figure 1: The neural network architecture used in the case study. For a given time $t$, the diagram shows nine state variables (i.e., Stage, LAI, ESW of five soil layers, CuIrrig and CuRain) at the leftmost input layer and five candidate actions (i.e., irrigation of 0, 10, 20, 30 and 40 mm) at the rightmost output layer. At every decision point, the neural network takes nine inputs from the sensors and returns five probabilities for action prescription. Owing to space restrictions, only 4% of the nodes in each of the five middle layers are drawn. A bias node (i.e., intercept term) is also drawn at the top of each middle layer.
  • Figure 2: Profits resulting from the benchmark replenishment rule and the learned decision rule over the training (1981–2010) and the testing (2011–2020) years. Each profit of the learned decision rule is the average of 30 replicates.
  • Figure 3: Prescribed daily irrigation probabilities from sowing (Day 122) to stage 85 (Day 285) in two of 30 replicates for Year 2020. The first replicate is presented in (a) and the second one in (b). Dots ($\bullet$) and triangles ($\blacktriangledown$) represent daily rainfall and realised irrigation amounts respectively.
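The mapping described in the Figure 1 caption (nine inputs, five middle layers with bias terms, five output probabilities) can be sketched as a plain forward pass. This is an illustration under stated assumptions: the hidden width of 16, the ReLU activation, the weight initialisation and all function names are hypothetical, since the captions give none of them, and the parameter count here is not meant to match the paper's network.

```python
import math
import random

N_INPUTS = 9         # Stage, LAI, ESW of five soil layers, CuIrrig, CuRain
ACTIONS_MM = [0, 10, 20, 30, 40]
N_HIDDEN_LAYERS = 5  # "five middle layers" in the Figure 1 caption
HIDDEN_WIDTH = 16    # hypothetical; the caption does not state the width


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_layers(rng, sizes):
    """Each layer is a list of rows; a row holds input weights plus one bias."""
    return [
        [[rng.uniform(-0.1, 0.1) for _ in range(n_in + 1)] for _ in range(n_out)]
        for n_in, n_out in zip(sizes[:-1], sizes[1:])
    ]


def forward(layers, state):
    """Return one probability per candidate irrigation amount."""
    x = list(state)
    for i, layer in enumerate(layers):
        # zip stops at len(x), so the trailing bias weight is added separately
        z = [row[-1] + sum(w * v for w, v in zip(row, x)) for row in layer]
        is_output = i == len(layers) - 1
        x = softmax(z) if is_output else [max(0.0, v) for v in z]  # ReLU assumed
    return x


def count_parameters(layers):
    """Total number of weights and biases."""
    return sum(len(row) for layer in layers for row in layer)


def prescribe(layers, state, rng):
    """Sample an irrigation amount (mm) from the prescribed probabilities."""
    return rng.choices(ACTIONS_MM, weights=forward(layers, state), k=1)[0]


if __name__ == "__main__":
    rng = random.Random(0)
    sizes = [N_INPUTS] + [HIDDEN_WIDTH] * N_HIDDEN_LAYERS + [len(ACTIONS_MM)]
    layers = init_layers(rng, sizes)
    state = [0.5] * N_INPUTS  # made-up, pre-scaled sensor readings
    print(count_parameters(layers), forward(layers, state), prescribe(layers, state, rng))
```

Because the output is a probability distribution and the applied amount is sampled from it, two runs over the same season can realise different irrigation sequences, which is consistent with the captions reporting results over 30 replicates.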