Comparing Data-Driven and Mechanistic Models for Predicting Phenology in Deciduous Broadleaf Forests
Christian Reimers, David Hafezi Rachti, Guahua Liu, Alexander J. Winkler
TL;DR
This work evaluates a data-driven hybrid approach to phenology prediction in deciduous broadleaf forests by forecasting the green chromatic coordinate ($GCC$) from meteorological time series using wavelet-transformed inputs and a ResNet-152 ensemble. Targets are derived from PhenoCam, and the model predicts daily $GCC$ values across a year along with phenology markers such as start of season ($SoS$) and end of season ($EoS$), aiming to replace at least part of the phenology component in land surface models. The data-driven method outperforms two mechanistic models for $GCC$ and $SoS$, and interpretability analyses show that it relies on long-timescale climate features rather than immediate weather events. Predicting $EoS$ remains challenging due to data and site heterogeneity. The results highlight the potential of hybrid data-driven approaches to improve climate-related phenology predictions while underscoring the need for multi-source data and robust normalization across sites.
Abstract
Understanding the future climate is crucial for informed policy decisions on climate change prevention and mitigation. Earth system models play an important role in predicting future climate, requiring accurate representation of complex sub-processes that span multiple time scales and spatial scales. One such process that links seasonal and interannual climate variability to cyclical biological events is tree phenology in deciduous broadleaf forests. Phenological dates, such as the start and end of the growing season, are critical for understanding the exchange of carbon and water between the biosphere and the atmosphere. Mechanistic prediction of these dates is challenging. Hybrid modelling, which integrates data-driven approaches into complex models, offers a solution. In this work, as a first step towards this goal, we train a deep neural network to predict a phenological index from meteorological time series. We find that this approach outperforms traditional process-based models. This highlights the potential of data-driven methods to improve climate predictions. We also analyze which variables and aspects of the time series influence the predicted onset of the season, in order to gain a better understanding of the advantages and limitations of our model.
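
For readers unfamiliar with the phenological index named in the TL;DR, the green chromatic coordinate is conventionally defined as the green share of total image brightness, $GCC = G / (R + G + B)$, computed from the mean red, green, and blue digital numbers of a canopy region of interest. The following is a minimal sketch of that definition only. The function name and the channel values are hypothetical illustrations, not taken from this work or from PhenoCam data.

```python
def green_chromatic_coordinate(red, green, blue):
    """Return GCC = G / (R + G + B) for mean channel values of a region of interest."""
    total = red + green + blue
    if total == 0:
        raise ValueError("R + G + B must be non-zero")
    return green / total


# Hypothetical mean channel values, chosen only to illustrate the seasonal contrast.
winter_gcc = green_chromatic_coordinate(90.0, 90.0, 90.0)   # grey, leafless canopy
summer_gcc = green_chromatic_coordinate(80.0, 120.0, 60.0)  # green, leafed-out canopy
```

Under this definition a colourless (grey) scene sits at one third, and the index rises as the canopy greens, which is why a daily $GCC$ series can serve as a target for markers such as $SoS$ and $EoS$.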
