Foundation Models for Environmental Science: A Survey of Emerging Frontiers

Runlong Yu, Shengyu Chen, Yiqun Xie, Huaxiu Yao, Jared Willard, Xiaowei Jia

TL;DR

The paper surveys how foundation models can transform environmental science by unifying heterogeneous data through large-scale pre-training and multi-modal architectures. It maps applications to forward prediction, data generation, data assimilation, downscaling, inverse modeling, model ensembling, and decision-making, detailing data collection, architecture, training, tuning, and evaluation workflows. It highlights opportunities in knowledge-guided learning, active learning, benchmark datasets, and intermediate-process reasoning, while acknowledging challenges in interpretability, uncertainty, and extreme-event prediction. The work underscores the practical impact of foundation models for scalable, data-rich environmental decision-making and scientific discovery, advocating interdisciplinary collaboration to advance sustainable environmental management.

Abstract

Modeling environmental ecosystems is essential for effective resource management, sustainable development, and understanding complex ecological processes. However, traditional data-driven methods face challenges in capturing inherently complex and interconnected processes and are further constrained by limited observational data in many environmental applications. Foundation models, which leverage large-scale pre-training and universal representations of complex and heterogeneous data, offer transformative opportunities for capturing spatiotemporal dynamics and dependencies in environmental processes, and facilitate adaptation to a broad range of applications. This survey presents a comprehensive overview of foundation model applications in environmental science, highlighting advancements in common environmental use cases including forward prediction, data generation, data assimilation, downscaling, inverse modeling, model ensembling, and decision-making across domains. We also detail the process of developing these models, covering data collection, architecture design, training, tuning, and evaluation. Through discussions on these emerging methods as well as their future opportunities, we aim to promote interdisciplinary collaboration that accelerates advancements in machine learning for driving scientific discovery in addressing critical environmental challenges.

Paper Structure

This paper contains 40 sections, 2 figures, 1 table.

Figures (2)

  • Figure 1: Application-centric objectives and advancements enabled by foundation models.
  • Figure 2: Model design workflow for foundation models in environmental science.