Bridging the Gap Between Climate Science and Machine Learning in Climate Model Emulation

Luca Schmidt, Nina Effenberger

Abstract

While climate models provide insights for climate decision-making, their use is constrained by significant computational and technical demands. Although machine learning (ML) emulators offer a way to bypass the high computational costs, their effective use remains challenging. The hurdles are diverse, ranging from limited accessibility and a lack of specialized knowledge to a general mistrust of ML methods that are perceived as insufficiently physical. Here, we introduce a framework to overcome these barriers by integrating both climate science and machine learning perspectives. We find that designing easy-to-adopt emulators that address a clearly defined task and demonstrating their reliability offer a promising path for bridging the gap between our two fields.

Paper Structure (15 sections, 3 figures)

Figures (3)

  • Figure 1: Climate model emulation. Climate models (CMs) use physical equations to transform input data (top left) into consistent output (top right). These outputs can in turn serve as inputs to subsequent modeling steps (dashed arrow). Data-driven emulators can replace such model components or steps, but typically require paired climate model simulations for training. In addition to classic climate model behavior, emulators can integrate bias correction or extreme event simulation directly into the modeling pipeline.
  • Figure 2: Combined emulator workflow. The top row shows the typical path of a research project in ML; the bottom row shows applied research. The goal of emulator ML research is to develop and validate an emulator to generate data. Research in climate science (CS) typically centers on analyzing and validating such data. Emulated data can be used for this purpose, but they are usually not necessary.
  • Figure A.1: Checklist for emulator development.