Proof-of-concept: Using ChatGPT to Translate and Modernize an Earth System Model from Fortran to Python/JAX
Anthony Zhou, Linnia Hawkins, Pierre Gentine
TL;DR
Earth system models are largely written in Fortran, which lacks a GPU runtime and differentiability; this limits GPU acceleration and online learning. The authors introduce a semi-automated translation workflow that divides a Fortran codebase into units, translates each unit to Python/JAX using GPT-4, and validates each translation via unit testing. The workflow is demonstrated on the leaf-level photosynthesis component of CESM CLM. The Python/JAX version enables automatic differentiation and gradient-based parameter estimation, e.g., estimating $V_{c,\max}$, and achieves substantial runtime improvements (up to $\sim100\times$ on GPU) compared with the original Fortran implementation. This work contributes an open-source translation pipeline and illustrates a feasible path toward differentiable, GPU-accelerated climate components that are more accessible to junior scientists.
Abstract
Earth system models (ESMs) are vital for understanding past, present, and future climate, but they suffer from legacy technical infrastructure. ESMs are primarily implemented in Fortran, a language that poses a high barrier to entry for early-career scientists and lacks a GPU runtime, which has become essential for continued advancement as GPU power increases and CPU scaling slows. Fortran also lacks differentiability, the capacity to differentiate through numerical code, which is needed to build hybrid models that integrate machine learning methods. Converting an ESM from Fortran to Python/JAX could resolve these issues. This work presents a semi-automated method for translating individual model components from Fortran to Python/JAX using a large language model (GPT-4). By translating the photosynthesis model from the Community Earth System Model (CESM), we demonstrate that the Python/JAX version runs up to 100x faster using GPU parallelization and enables parameter estimation via automatic differentiation. The Python code is also easy to read and run, and could be used by instructors in the classroom. This work illustrates a path towards the ultimate goal of making climate models fast, inclusive, and differentiable.
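To make the idea of parameter estimation via automatic differentiation concrete, the following is a minimal JAX sketch and not the authors' translated CLM code. It assumes a simplified Rubisco-limited assimilation rate of the Farquhar form, illustrative constant values, and synthetic observations generated from a known $V_{c,\max}$; the function names, constants, and learning rate are all hypothetical choices made for this example.

```python
import jax
import jax.numpy as jnp

# Illustrative constants (assumed values, not taken from CLM).
GAMMA_STAR = 4.3   # CO2 compensation point, Pa
KC = 40.0          # Michaelis-Menten constant for CO2, Pa
KO = 28000.0       # Michaelis-Menten constant for O2, Pa
O2 = 21000.0       # O2 partial pressure, Pa


def rubisco_limited_rate(vcmax, ci):
    """Simplified Rubisco-limited assimilation rate (hypothetical stand-in)."""
    return vcmax * (ci - GAMMA_STAR) / (ci + KC * (1.0 + O2 / KO))


def loss(vcmax, ci, observed):
    """Mean squared error between modeled and observed assimilation."""
    return jnp.mean((rubisco_limited_rate(vcmax, ci) - observed) ** 2)


# jax.grad differentiates the loss with respect to its first argument (vcmax).
grad_loss = jax.jit(jax.grad(loss))

# Synthetic "observations" generated from a known parameter value.
ci = jnp.linspace(15.0, 40.0, 50)          # intercellular CO2, Pa
true_vcmax = 60.0
observed = rubisco_limited_rate(true_vcmax, ci)

# Plain gradient descent starting from a deliberately wrong guess.
vcmax = jnp.asarray(20.0)
learning_rate = 5.0
for _ in range(200):
    vcmax = vcmax - learning_rate * grad_loss(vcmax, ci, observed)

print(float(vcmax))
```

Because the whole forward model is written in JAX, the gradient of the loss with respect to the parameter comes from `jax.grad` rather than from hand-derived or finite-difference derivatives, and the same code can run on CPU or GPU without modification.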
