Multi-modal Data Fusion and Deep Ensemble Learning for Accurate Crop Yield Prediction

Akshay Dagadu Yewle; Laman Mirzayeva; Oktay Karakuş

TL;DR

RicEns-Net tackles rice yield prediction by fusing SAR and optical data from the Sentinel-1/2/3 satellites with meteorological inputs. The authors develop a four-branch deep ensemble combining DenseNet, CNN, MLP, and Autoencoder models, with feature selection reducing the predictors from over 100 to 15 across five modalities. The output is computed as a weighted ensemble $y_{\text{RicEns-Net}} = \sum_{i=1}^{N} w_i y_i$ with $w_i = (1/e_i)/\sum_j (1/e_j)$. On EY Open Science Challenge 2023 data from An Giang, Vietnam, RicEns-Net achieves $MAE = 341.125$ kg/ha and $RMSE = 436.258$ kg/ha, with CV $R^2$ around 0.692, outperforming baseline models. The work demonstrates the value of multimodal data fusion and deep ensembles for precision agriculture, enabling more reliable yield forecasts and resource planning, while noting cloud cover and field boundary delineation as areas for improvement.
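The inverse-error weighting in the formula above can be illustrated with a minimal sketch. It assumes that $e_i$ is a strictly positive error score for branch $i$ (for example a validation MAE), which the summary does not state explicitly. The function names and all numbers below are hypothetical and are not taken from the paper.

```python
def inverse_error_weights(errors):
    """Return w_i = (1/e_i) / sum_j (1/e_j) for strictly positive errors."""
    if any(e <= 0 for e in errors):
        raise ValueError("errors must be strictly positive")
    inverse = [1.0 / e for e in errors]
    total = sum(inverse)
    return [v / total for v in inverse]


def weighted_ensemble(predictions, errors):
    """Combine per-branch predictions as y = sum_i w_i * y_i.

    predictions: one list of per-sample predictions for each branch.
    errors: one error score per branch (assumed here: validation MAE).
    """
    weights = inverse_error_weights(errors)
    n_samples = len(predictions[0])
    return [
        sum(w * branch[k] for w, branch in zip(weights, predictions))
        for k in range(n_samples)
    ]


# Hypothetical per-branch errors (kg/ha) and predictions for two fields.
branch_errors = [400.0, 500.0, 800.0, 1000.0]
branch_preds = [
    [6000.0, 6500.0],  # branch 1
    [6100.0, 6400.0],  # branch 2
    [5900.0, 6600.0],  # branch 3
    [6200.0, 6300.0],  # branch 4
]

weights = inverse_error_weights(branch_errors)
y_ens = weighted_ensemble(branch_preds, branch_errors)
```

Under this scheme the weights sum to one and a branch with a lower error receives a larger weight, so the ensemble prediction for each sample always lies between the smallest and largest branch predictions.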

Abstract

This study introduces RicEns-Net, a novel deep ensemble model designed to predict crop yields by integrating diverse data sources through multimodal data fusion techniques. The research focuses specifically on synthetic aperture radar (SAR) and optical remote sensing data from the Sentinel-1, -2, and -3 satellites, together with meteorological measurements such as surface temperature and rainfall. The initial field data for the study were acquired through Ernst & Young's (EY) Open Science Challenge 2023. The primary objective is to enhance the precision of crop yield prediction by developing a machine-learning framework capable of handling complex environmental data. A comprehensive data engineering process was employed to select the most informative features from over 100 potential predictors, reducing the set to 15 features from 5 distinct modalities. This step mitigates the "curse of dimensionality" and enhances model performance. The RicEns-Net architecture combines multiple machine learning algorithms in a deep ensemble framework, integrating the strengths of each technique to improve predictive accuracy. Experimental results demonstrate that RicEns-Net achieves a mean absolute error (MAE) of 341 kg/ha (roughly 5-6% of the lowest average yield in the region), significantly outperforming previous state-of-the-art models, including those developed during the EY challenge.
