Data-driven Crop Growth Simulation on Time-varying Generated Images using Multi-conditional Generative Adversarial Networks

Lukas Drees, Dereje T. Demie, Madhuri R. Paul, Johannes Leonhardt, Sabine J. Seidel, Thomas F. Döring, Ribana Roscher

TL;DR

The paper introduces a two-stage data-driven crop growth framework that first generates time-varying plant images using a multi-conditional CWGAN-GP and then derives plant traits (projected leaf area, PLA, or biomass) from the generated images via dedicated growth-estimation models. By integrating multiple conditioning variables (time, treatment, and simulated biomass) through conditional batch normalization, the approach produces realistic temporal image sequences across three crop systems and enables data-driven simulations of management changes. Growth estimates derived from generated images are shown to be sufficiently accurate to evaluate and compare against reference process-based models, demonstrating the potential to link image-based phenotyping with process-based crop growth insights. Transferability to a new site faces challenges due to style and spectral differences, highlighting the need for site-aware conditioning or expanded multi-site training to improve generalization and applicability in heterogeneous agricultural settings.
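
For orientation, the critic objective of a conditional Wasserstein GAN with gradient penalty is commonly written as below. This is the standard WGAN-GP formulation with a condition vector $c$ added to the critic and generator inputs; it is not necessarily the exact loss used in the paper. The symbols $D$ (critic), $G$ (generator), $x_{\mathrm{in}}$ (input image), $z$ (noise), and $\lambda$ (penalty weight) are notation assumed here.

$$
\mathcal{L}_D = \mathbb{E}_{\tilde{x}}\big[D(\tilde{x}, c)\big] - \mathbb{E}_{x}\big[D(x, c)\big] + \lambda\, \mathbb{E}_{\hat{x}}\Big[\big(\lVert \nabla_{\hat{x}} D(\hat{x}, c) \rVert_2 - 1\big)^2\Big],
\qquad
\mathcal{L}_G = -\,\mathbb{E}_{\tilde{x}}\big[D(\tilde{x}, c)\big]
$$

Here $\tilde{x} = G(x_{\mathrm{in}}, z, c)$ is a generated image, $x$ is a real image, and $\hat{x} = \epsilon x + (1 - \epsilon)\tilde{x}$ with $\epsilon \sim U[0, 1]$ is a random interpolate on which the gradient norm is penalized.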

Abstract

Image-based crop growth modeling can substantially contribute to precision agriculture by revealing spatial crop development over time, which allows an early and location-specific estimation of relevant future plant traits, such as leaf area or biomass. A prerequisite for realistic and sharp crop image generation is the integration of multiple growth-influencing conditions in a model, such as an image of an initial growth stage, the associated growth time, and further information about the field treatment. We present a two-stage framework consisting first of an image prediction model and second of a growth estimation model, both of which are trained independently. The image prediction model is a conditional Wasserstein generative adversarial network (CWGAN). In the generator of this model, conditional batch normalization (CBN) is used to integrate different conditions along with the input image. This allows the model to generate time-varying artificial images dependent on multiple influencing factors of different kinds. These images are used by the second part of the framework for plant phenotyping by deriving plant-specific traits and comparing them with those of non-artificial (real) reference images. For various crop datasets, the framework allows realistic, sharp image predictions with a slight loss of quality from short-term to long-term predictions. Simulations of varying growth-influencing conditions performed with the trained framework provide valuable insights into how such factors relate to crop appearance, which is particularly useful in complex, less explored crop mixture systems. Further results show that adding process-based simulated biomass as a condition increases the accuracy of the derived phenotypic traits from the predicted images. This demonstrates the potential of our framework to serve as an interface between image-based and process-based crop growth models.
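
To make the role of conditional batch normalization more concrete, the following is a minimal NumPy sketch of the general technique: feature maps are normalized per channel over the batch, then scaled and shifted by per-channel parameters that depend on a condition vector. The function name, the linear maps `w_gamma` and `w_beta`, and the encoding of the condition as normalized time plus a one-hot treatment are all assumptions for illustration; the paper's actual generator may embed and combine its conditions differently.

```python
import numpy as np

def conditional_batch_norm(x, cond, w_gamma, w_beta, eps=1e-5):
    """Hypothetical CBN layer (illustration only).

    x:       (N, C, H, W) feature maps
    cond:    (N, K) condition vectors, e.g. time and treatment
    w_gamma: (K, C) linear map from condition to per-channel scale offset
    w_beta:  (K, C) linear map from condition to per-channel shift
    """
    # Batch statistics per channel, as in ordinary batch normalization.
    mean = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    x_hat = (x - mean) / np.sqrt(var + eps)

    # Condition-dependent affine parameters, one pair per sample and channel.
    gamma = 1.0 + cond @ w_gamma  # (N, C); equals 1 when w_gamma is zero
    beta = cond @ w_beta          # (N, C); equals 0 when w_beta is zero
    return gamma[:, :, None, None] * x_hat + beta[:, :, None, None]

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3, 8, 8))

# Assumed condition layout: [normalized time, treatment A, treatment B].
cond = np.array([[0.1, 1.0, 0.0],
                 [0.5, 1.0, 0.0],
                 [0.5, 0.0, 1.0],
                 [0.9, 0.0, 1.0]])
w_gamma = rng.normal(scale=0.1, size=(3, 3))
w_beta = rng.normal(scale=0.1, size=(3, 3))

y = conditional_batch_norm(x, cond, w_gamma, w_beta)

# With zero weights the layer reduces to plain batch normalization.
y_plain = conditional_batch_norm(x, cond, np.zeros((3, 3)), np.zeros((3, 3)))
```

In a trained network, `w_gamma` and `w_beta` would be learned, so that changing only the time or treatment entry of `cond` changes how the same encoded input image is decoded.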

Paper Structure (23 sections, 5 equations, 16 figures, 7 tables)


Figures (16)

  • Figure 1: Proposed two-step crop growth simulation framework: In the first step of image prediction, an input image is initially encoded with its associated time (t) and treatment (trt). Then, this encoded representation can be decoded into newly generated images with varying growth stages for different simulation times and treatments. In the second step of growth estimation, target parameters such as projected leaf area or biomass are estimated from the images and analyzed over time. Both models are trained independently.
  • Figure 2: Example evolution over time of one plant from each of the datasets (a) Arabidopsis, (b) GrowliFlower, (c) Mixed-CKA, and (d) Mixed-WG, visualized by georeferenced clips from RGB orthophotos. The numbers above the images indicate the growth stage in days after sowing (DAS) for (a), (c), and (d), and in days after planting (DAP) for (b).
  • Figure 3: Scatter plots of dried biomass estimation from Mixed-CKA imagery over all growth stages and all treatments (mixtures and monocultural fields), split into spring wheat (SW) and faba bean (FB).
  • Figure 4: Time-varying image prediction for Arabidopsis with, in the top row, reference images with an early growth stage as input (cyan frame), in the second row, all day-wise generated predictions, and, in the third row, standard deviation images over 10 predictions with different noise input $z$ and otherwise constant input conditions. The two bottom rows show the quality metrics: learned perceptual image patch similarity (LPIPS), multiscale structural similarity (MS-SSIM), and the projected leaf area difference ($\Delta$PLA).
  • Figure 5: Time-varying image prediction for GrowliFlower with, in the top row, reference images with an early growth stage as input (cyan frame), in the second row, all day-wise generated predictions, and, in the third row, standard deviation images over 10 predictions with different noise input $z$ and otherwise constant input conditions. The two bottom rows show the quality metrics: learned perceptual image patch similarity (LPIPS), multiscale structural similarity (MS-SSIM), and the projected leaf area difference ($\Delta$PLA).
  • ...and 11 more figures
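
The standard deviation images and the $\Delta$PLA metric mentioned in the captions of Figures 4 and 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's evaluation code: the helper names are hypothetical, the standard deviation is averaged over color channels, PLA is taken as the number of plant pixels in a given binary mask, and $\Delta$PLA is assumed to be prediction minus reference in pixels. How the masks are obtained (for example by a segmentation model) is outside this sketch.

```python
import numpy as np

def std_image(predictions):
    """Per-pixel standard deviation over a stack of generated images.

    predictions: (S, H, W, 3) array of S images generated from the same
    input and conditions but different noise z. Returns an (H, W) map,
    averaged over the color channels (an assumption of this sketch).
    """
    return predictions.std(axis=0).mean(axis=-1)

def pla(mask):
    """Projected leaf area as the count of plant pixels in a binary mask."""
    return int(np.count_nonzero(mask))

def delta_pla(pred_mask, ref_mask):
    """PLA difference, assumed here to be prediction minus reference."""
    return pla(pred_mask) - pla(ref_mask)

rng = np.random.default_rng(0)
preds = rng.random((10, 4, 4, 3))  # stand-in for 10 generated images
sd = std_image(preds)

ref_mask = np.zeros((4, 4), dtype=bool)
ref_mask[1:3, 1:3] = True          # 4 plant pixels in the reference
pred_mask = np.zeros((4, 4), dtype=bool)
pred_mask[1:3, 1:4] = True         # 6 plant pixels in the prediction

print(sd.shape, delta_pla(pred_mask, ref_mask))  # prints "(4, 4) 2"
```

A standard deviation map close to zero everywhere would indicate that the noise input has little influence on the generated image for the given conditions.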