UFLUX v2.0: A Process-Informed Machine Learning Framework for Efficient and Explainable Modelling of Terrestrial Carbon Uptake

Wenquan Dong; Songyan Zhu; Jian Xu; Casey M. Ryan; Man Chen; Jingya Zeng; Hao Yu; Congfeng Cao; Jiancheng Shi

TL;DR

The paper addresses biases in process-based estimates of gross primary productivity (GPP) that arise from model simplifications and limited eddy covariance (EC) data coverage. It introduces UFLUX v2.0, a bias-correction framework that couples a mechanistic light use efficiency (LUE) process model with a machine learning (ML) bias learner trained on EC measurements, validated with 5-fold cross-validation across FLUXNET2015 sites. Tower-level GPP prediction improves substantially (R^2 rises from 0.51 to 0.79 and RMSE drops from 3.09 to 1.60 g C m^-2 d^-1), with pronounced gains in forest ecosystems. Global GPP totals remain similar to those of the process-based approach, but the latitudinal distribution differs notably. The approach improves adaptability across ecosystems and offers insight into how environmental change may redistribute terrestrial carbon uptake, supporting large-scale carbon cycle assessments.
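The two accuracy metrics quoted above can be made concrete with a short sketch. It assumes R^2 is the coefficient of determination (1 - SS_res / SS_tot); the summary does not say whether the paper uses this definition or the squared Pearson correlation. The function names and the five sample values are hypothetical and are not taken from the paper.

```python
import numpy as np

def rmse(obs, pred):
    """Root-mean-square error, in the same units as the inputs."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    return float(np.sqrt(np.mean((pred - obs) ** 2)))

def r_squared(obs, pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    obs, pred = np.asarray(obs, float), np.asarray(pred, float)
    ss_res = np.sum((obs - pred) ** 2)
    ss_tot = np.sum((obs - obs.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Hypothetical daily GPP values in g C m^-2 d^-1 (illustrative only).
ec_gpp = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
model_gpp = np.array([1.5, 2.5, 2.5, 4.5, 4.5])

print(rmse(ec_gpp, model_gpp))       # 0.5
print(r_squared(ec_gpp, model_gpp))  # 0.875
```

RMSE carries the units of GPP, so it is directly comparable to the 3.09 and 1.60 g C m^-2 d^-1 figures, while R^2 is unitless.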

Abstract

Gross Primary Productivity (GPP), the amount of carbon that plants fix through photosynthesis, is pivotal for understanding the global carbon cycle and ecosystem functioning. Process-based models, which are built on knowledge of ecological processes, are susceptible to biases stemming from their assumptions and approximations. These limitations can introduce considerable uncertainty into global GPP estimates, which may pose significant challenges to Net Zero goals. This study presents UFLUX v2.0, a process-informed model that integrates state-of-the-art ecological knowledge and advanced machine learning techniques to reduce uncertainties in GPP estimation by learning the biases between process-based models and eddy covariance (EC) measurements. UFLUX v2.0 substantially improved model accuracy, achieving an R^2 of 0.79 and an RMSE of 1.60 g C m^-2 d^-1, compared with the process-based model's R^2 of 0.51 and RMSE of 3.09 g C m^-2 d^-1. Our global GPP distribution analysis indicates that while UFLUX v2.0 and the process-based model produced similar global total GPP (137.47 Pg C and 132.23 Pg C, respectively), they differed strongly in spatial distribution, particularly along latitudinal gradients. These differences are very likely due to systematic biases in the process-based model and differing sensitivities to climate and environmental conditions. This study offers improved adaptability for GPP modelling across diverse ecosystems and further enhances our understanding of the global carbon cycle and its responses to environmental changes.
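The core idea of learning the bias between a process-based estimate and EC measurements can be illustrated with a minimal residual-learning sketch. Everything below is an assumption made for illustration: the data are synthetic rather than FLUXNET2015, the two "drivers" are hypothetical predictors, and an ordinary least-squares model stands in for the ML learner, whose actual form is not given in this summary. The folds are random sample splits, which may differ from how the authors partitioned their data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins (NOT FLUXNET2015 data): two drivers and an "observed" GPP.
n = 500
drivers = rng.uniform(0.0, 1.0, size=(n, 2))
ec_gpp = 8.0 * drivers[:, 0] + 3.0 * drivers[:, 1] + rng.normal(0.0, 0.3, n)
# Hypothetical process-based estimate with a driver-dependent systematic bias.
process_gpp = 5.0 * drivers[:, 0] + 1.0

def rmse(obs, pred):
    return float(np.sqrt(np.mean((np.asarray(pred) - np.asarray(obs)) ** 2)))

def fit_bias_learner(features, bias):
    """Least-squares linear model of the bias; a stand-in for the ML learner."""
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, bias, rcond=None)
    return coef

def predict_bias(coef, features):
    design = np.column_stack([np.ones(len(features)), features])
    return design @ coef

def cross_validated_correction(features, process, observed, k=5, seed=0):
    """Return out-of-fold bias-corrected estimates from k-fold cross-validation."""
    order = np.random.default_rng(seed).permutation(len(observed))
    folds = np.array_split(order, k)
    corrected = np.empty(len(observed))
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        # Learn the bias (observed minus process-based) on training samples only.
        coef = fit_bias_learner(features[train_idx],
                                observed[train_idx] - process[train_idx])
        # Corrected estimate = process-based estimate + predicted bias.
        corrected[test_idx] = process[test_idx] + predict_bias(coef, features[test_idx])
    return corrected

corrected_gpp = cross_validated_correction(drivers, process_gpp, ec_gpp, k=5)
rmse_process = rmse(ec_gpp, process_gpp)
rmse_corrected = rmse(ec_gpp, corrected_gpp)
print(f"process-based RMSE: {rmse_process:.2f}, bias-corrected RMSE: {rmse_corrected:.2f}")
```

Because each sample is corrected by a learner that never saw it during fitting, the reported error reflects out-of-sample skill rather than a fit to the training data.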

Paper Structure (12 sections, 6 equations, 3 figures, 1 table)

Introduction
Methodology
UFLUX v2.0
Process-based model
Data
Eddy Covariance
Remote sensing data
Climate reanalysis
Results and Discussion
Tower-Level Validation
Global distribution of GPP
Summary and Conclusion

Figures (3)

Figure 1: Schematic workflow of the UFLUX v2.0 framework for GPP estimation. The upper panel shows the process-based model component, integrating satellite and climate data. The lower panel illustrates the UFLUX v2.0 enhancement, incorporating machine learning for adaptive bias correction using eddy covariance measurements, resulting in improved global GPP estimates.
Figure 2: Comparison of modeled GPP estimates against EC GPP measurements at the tower level. (a) Process-based model GPP estimates versus EC GPP, and (b) UFLUX v2.0 GPP estimates versus EC GPP. The dashed line represents the 1:1 line. The color of the scatter points indicates the point density, calculated using a Gaussian kernel density estimation, with colors ranging from dark blue to yellow. Darker colors represent areas with lower data density, while lighter colors (yellow) indicate areas of higher data density.
Figure 3: Global distribution and comparison of GPP estimates from UFLUX v2.0 and the process-based model for the year 2010. (a) Global GPP estimated by UFLUX v2.0. (b) Global GPP estimated by the process-based model. (c) Longitudinal distribution of GPP for both models. (d) Percentage difference in GPP estimates between UFLUX v2.0 and the process-based model, calculated as (UFLUX v2.0 - Process-based model) / UFLUX v2.0 × 100%. (e) Latitudinal distribution of GPP for both models.
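The percentage difference defined in the Figure 3(d) caption can be written as a small helper. This is a sketch of that formula only; the function name and the choice to return NaN where the UFLUX v2.0 value is zero (where the ratio is undefined) are assumptions, since the summary does not say how such cells are handled.

```python
import numpy as np

def percent_difference(uflux_gpp, process_gpp):
    """(UFLUX v2.0 - process-based) / UFLUX v2.0 * 100, NaN where UFLUX v2.0 is 0."""
    uflux = np.asarray(uflux_gpp, float)
    process = np.asarray(process_gpp, float)
    # Suppress divide-by-zero warnings; those cells are masked to NaN below.
    with np.errstate(divide="ignore", invalid="ignore"):
        pct = (uflux - process) / uflux * 100.0
    return np.where(uflux == 0, np.nan, pct)

# Hypothetical grid-cell values (illustrative only).
example = percent_difference([2.0, 4.0, 0.0], [1.0, 5.0, 1.0])
print(example)  # [ 50. -25.  nan]

# Applying the same formula to the global totals quoted in the abstract (Pg C).
global_pct = float(percent_difference(137.47, 132.23))
print(round(global_pct, 2))  # 3.81
```

Positive values mean UFLUX v2.0 estimates more GPP than the process-based model in that cell. The small difference in global totals shows how a near-agreement in the sum can coexist with large cell-level differences of either sign.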
