VITA: Variational Pretraining of Transformers for Climate-Robust Crop Yield Forecasting
Adib Hasan, Mardavij Roozbehani, Munther Dahleh
TL;DR
VITA tackles climate-robust crop yield forecasting under data asymmetry: it pretrains a Transformer encoder on rich satellite-based weather data and transfers it to settings where only limited ground-based weather measurements are available. It uses a decoder-free variational objective with a seasonality-aware sinusoidal prior to learn latent atmospheric representations, and is then fine-tuned on limited weather statistics and past yields. Across 763 US Corn Belt counties, VITA achieves state-of-the-art performance, especially in extreme years, with strong cross-regional transfer and better data efficiency than larger foundation models. Because it relies only on public data, the approach is practical to deploy at scale and can support risk management and resilience in climate-impacted agriculture.
Abstract
Accurate crop yield forecasting is essential for global food security. However, current AI models systematically underperform when yields deviate from historical trends. We attribute this to the lack of rich, physically grounded datasets directly linking atmospheric states to yields. To address this, we introduce VITA (Variational Inference Transformer for Asymmetric Data), a variational pretraining framework that learns representations from large satellite-based weather datasets and transfers them to the limited ground-based measurements available for yield prediction. During pretraining, VITA uses detailed meteorological variables as proxy targets and learns to predict latent atmospheric states under a seasonality-aware sinusoidal prior. This allows the model to be fine-tuned using only limited weather statistics at deployment. Applied to 763 counties in the US Corn Belt, VITA achieves state-of-the-art performance in predicting corn and soybean yields across all evaluation scenarios, particularly during extreme years, with statistically significant improvements (paired t-test, p < 0.0001). Importantly, VITA outperforms prior frameworks such as GNN-RNN without using soil data, and outperforms larger foundation models (e.g., Chronos-Bolt) with less compute, making it practical for real-world use, especially in data-scarce regions. This work highlights how domain-aware AI design can overcome data limitations and support resilient agricultural forecasting in a changing climate.
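
To make the idea of a seasonality-aware sinusoidal prior concrete, the following is a minimal sketch of one way such a prior could enter a variational objective. It is not the authors' implementation: the abstract does not specify the parameterization, so the univariate Gaussian form, the weekly time index, and the names `sinusoidal_prior_mean`, `gaussian_kl`, `amplitude`, `period`, and `phase` are all assumptions made for illustration. The sketch computes the closed-form KL divergence between a Gaussian posterior over a latent state and a Gaussian prior whose mean follows a sinusoid over the year.

```python
import numpy as np

def sinusoidal_prior_mean(t, amplitude, period, phase):
    # Hypothetical prior mean: one sinusoid over the time index t.
    return amplitude * np.sin(2.0 * np.pi * t / period + phase)

def gaussian_kl(mu_q, sigma_q, mu_p, sigma_p):
    # Closed-form KL(q || p) for univariate Gaussians, applied elementwise.
    return (np.log(sigma_p / sigma_q)
            + (sigma_q ** 2 + (mu_q - mu_p) ** 2) / (2.0 * sigma_p ** 2)
            - 0.5)

# Assumed setup: 52 weekly time steps and an annual period of 52 weeks.
t = np.arange(52)
mu_p = sinusoidal_prior_mean(t, amplitude=1.0, period=52.0, phase=0.0)
unit_sigma = np.ones_like(mu_p)

# A posterior that matches the seasonal prior incurs no penalty.
kl_match = gaussian_kl(mu_p, unit_sigma, mu_p, unit_sigma)

# A posterior whose mean is offset from the seasonal cycle is penalized.
kl_shift = gaussian_kl(mu_p + 0.5, unit_sigma, mu_p, unit_sigma)
```

In this sketch the penalty depends only on how far the posterior departs from the seasonal cycle, which is the sense in which a prior of this kind can encode seasonality without a decoder reconstructing the inputs.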
