PlantTraitNet: An Uncertainty-Aware Multimodal Framework for Global-Scale Plant Trait Inference from Citizen Science Data

Ayushi Sharma, Johanna Trost, Daniel Lusk, Johannes Dollinger, Julian Schrader, Christian Rossi, Javier Lopatin, Etienne Laliberté, Simon Haberstroh, Jana Eichel, Daniel Mederer, Jose Miguel Cerda-Paredes, Shyam S. Phartyal, Lisa-Maricia Schwarz, Anja Linstädter, Maria Conceição Caldeira, Teja Kattenborn

TL;DR

PlantTraitNet addresses the global gap in plant trait data by uniting weakly supervised citizen-science imagery with uncertainty-aware, multimodal learning to predict four key traits (height, leaf area, specific leaf area, and leaf nitrogen). It integrates image features, monocular depth priors, and geospatial context via geospatial foundation models to generate 1-degree global trait maps, validated against sPlotOpen and surpassing prior products. An uncertainty-guided data cleaning loop reduces label noise and enables robust generalization, while multi-task learning captures trait interdependencies and improves efficiency. The approach demonstrates scalability, captures within-species variability, and offers a valuable tool for ecological research and Earth-system modeling.

Abstract

Global maps of plant traits, such as leaf nitrogen or plant height, are essential for understanding ecosystem processes, including the carbon and energy cycles of the Earth system. However, existing trait maps remain limited by the high cost and sparse geographic coverage of field-based measurements. Citizen science initiatives offer a largely untapped resource to overcome these limitations, with over 50 million geotagged plant photographs worldwide capturing valuable visual information on plant morphology and physiology. In this study, we introduce PlantTraitNet, a multi-modal, multi-task uncertainty-aware deep learning framework that predicts four key plant traits (plant height, leaf area, specific leaf area, and nitrogen content) from citizen science photos using weak supervision. By aggregating individual trait predictions across space, we generate global maps of trait distributions. We validate these maps against independent vegetation survey data (sPlotOpen) and benchmark them against leading global trait products. Our results show that PlantTraitNet consistently outperforms existing trait maps across all evaluated traits, demonstrating that citizen science imagery, when integrated with computer vision and geospatial AI, enables not only scalable but also more accurate global trait mapping. This approach offers a powerful new pathway for ecological research and Earth system modeling.
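
The spatial aggregation step mentioned in the abstract can be illustrated with a minimal sketch. It assumes per-photo predictions arrive as (latitude, longitude, value) tuples and that each 1-degree cell is summarized by its median; the abstract does not state which statistic the authors use, and the function name `aggregate_to_grid` and the sample values are hypothetical.

```python
import math
from collections import defaultdict
from statistics import median


def aggregate_to_grid(records, cell_size_deg=1.0):
    """Group (lat, lon, value) predictions into grid cells; return the median per cell.

    Assumption: the median is the per-cell summary. The source does not
    specify the aggregation statistic.
    """
    cells = defaultdict(list)
    for lat, lon, value in records:
        # Index each cell by its south-west corner,
        # e.g. (48.0, 7.0) for a photo at lat 48.7, lon 7.9.
        key = (
            math.floor(lat / cell_size_deg) * cell_size_deg,
            math.floor(lon / cell_size_deg) * cell_size_deg,
        )
        cells[key].append(value)
    return {key: median(values) for key, values in cells.items()}


# Hypothetical per-photo plant height predictions in metres.
predictions = [
    (48.7, 7.9, 0.4),
    (48.2, 7.1, 0.6),
    (48.9, 7.5, 1.1),
    (-33.4, -70.6, 2.0),
]
grid = aggregate_to_grid(predictions)
```

Here the three photos in the cell with south-west corner (48.0, 7.0) collapse to a single value, which is the kind of per-cell summary that could then be compared against survey data falling in the same cell.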

Paper Structure

This paper contains 38 sections, 2 equations, 18 figures, 8 tables.

Figures (18)

  • Figure 1: Geographic coverage of the citizen science data (top) and independent benchmark reference data (bottom) from vegetation surveys (sPlotOpen; Sabatini et al., 2021).
  • Figure 2: Randomly sampled images with the highest and lowest predictive uncertainty (see Methodology). Observations: high height uncertainty often stems from unsuitable contexts (winter scenes, fruits, hands); high SLA uncertainty from images lacking visible leaves (bare branches, flowers, buds); high leaf nitrogen uncertainty from low-quality or blurry images; and high leaf area uncertainty from exotic leaf types (e.g., ferns).
  • Figure 3: The model integrates image, depth, and geospatial embeddings. These are fused within a multimodal backbone, which then uses individual heads to predict height (H), leaf area (LA), specific leaf area (SLA), and leaf nitrogen (LN).
  • Figure 4: Overview of the pipeline. We filter weakly labeled citizen science data (Raw data) based on high model uncertainty (Step 1) and large residuals from species trait medians (Step 2). We use this refined data to train the models (Step 3), which are evaluated by comparing spatially aggregated predictions (1° resolution) against overlapping vegetation surveys (sPlotOpen).
  • Figure 5: Mean relative prediction error (MRPE) computed on validation data at the family level, visualized along the taxonomic tree, for height (H), leaf area (LA), specific leaf area (SLA) and leaf nitrogen (LN).
  • ...and 13 more figures
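
The two filtering steps described in the Figure 4 caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the residual is taken to be the absolute difference between a model prediction and the species-level trait median, and the `Sample` fields, the `clean` function, and both thresholds are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    image_id: str
    species_median: float  # weak label: species-level trait median
    prediction: float      # model's predicted trait value
    uncertainty: float     # model's predicted uncertainty


def clean(samples, max_uncertainty, max_residual):
    """Apply the two filters in sequence and return the retained samples.

    Assumption: both thresholds are fixed numbers chosen by the user;
    the source does not say how the cutoffs are set.
    """
    # Step 1: drop samples with high model uncertainty.
    confident = [s for s in samples if s.uncertainty <= max_uncertainty]
    # Step 2: drop samples whose prediction lies far from the species median.
    return [
        s for s in confident
        if abs(s.prediction - s.species_median) <= max_residual
    ]


# Hypothetical samples for a single trait.
samples = [
    Sample("a", species_median=1.0, prediction=1.1, uncertainty=0.2),
    Sample("b", species_median=1.0, prediction=1.0, uncertainty=0.9),  # fails step 1
    Sample("c", species_median=1.0, prediction=3.0, uncertainty=0.1),  # fails step 2
]
kept = clean(samples, max_uncertainty=0.5, max_residual=1.0)
```

Only sample "a" survives both filters; in the pipeline described by the caption, the retained subset would then be used for training (Step 3).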