Oceananigans.jl: A Julia library that achieves breakthrough resolution, memory and energy efficiency in global ocean simulations

Simone Silvestri, Gregory L. Wagner, Christopher Hill, Matin Raayai Ardakani, Johannes Blaschke, Jean-Michel Campin, Valentin Churavy, Navid C. Constantinou, Alan Edelman, John Marshall, Ali Ramadhan, Andre Souza, Raffaele Ferrari

TL;DR

Oceananigans.jl is a GPU-optimized ocean model, written from scratch in Julia, that delivers unprecedented resolution and energy efficiency. Using fused GPU kernels, adaptive WENO advection, and a barotropic solver optimized for parallel communication, it reaches 9.9 simulated years per day (SYPD) at 10 km on 68 A100 GPUs and 0.95 SYPD at 1.7 km on 576 A100 GPUs. The results demonstrate strong and weak scaling across realistic and aqua-planet domains, with leading energy-to-solution metrics, suggesting that routine climate simulations with 10 km ocean components are within reach of IPCC-class workflows. This could enable higher-resolution and larger-ensemble climate projections at substantially lower computational and energy cost.

Abstract

Climate models must simulate hundreds of future scenarios for hundreds of years at coarse resolutions, and a handful of high-resolution decadal simulations to resolve localized extreme events. Using Oceananigans.jl, written from scratch in Julia, we report several achievements. First, a global ocean simulation with breakthrough horizontal resolution -- 488 m -- reaching 15 simulated days per day (0.04 simulated years per day; SYPD). Second, Oceananigans simulates the global ocean at 488 m with breakthrough memory efficiency on just 768 Nvidia A100 GPUs, a fraction of the resources available on current and upcoming exascale supercomputers. Third, and arguably most significant for climate modeling, Oceananigans achieves breakthrough energy efficiency, reaching 0.95 SYPD at 1.7 km on 576 A100s and 9.9 SYPD at 10 km on 68 A100s -- the latter representing the highest horizontal resolution employed by current IPCC-class ocean models. Routine climate simulations with 10 km ocean components are within reach.
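
The throughput figures quoted in the abstract mix two units: simulated days per day and simulated years per day (SYPD). A minimal Python sketch of the conversion is given here. The 365-day year and the function names are assumptions made for illustration and are not taken from the paper.

```python
# Assumption: a 365-day calendar year; the paper does not state its convention.
DAYS_PER_YEAR = 365.0


def sypd_from_days_per_day(simulated_days_per_wall_day):
    """Convert simulated days per wall-clock day to simulated years per day."""
    return simulated_days_per_wall_day / DAYS_PER_YEAR


def wall_days_per_simulated_year(sypd):
    """Wall-clock days needed to integrate one simulated year."""
    return 1.0 / sypd


if __name__ == "__main__":
    # Throughput quoted for the 488 m configuration: 15 simulated days per day.
    sypd_488m = sypd_from_days_per_day(15.0)
    print(f"{sypd_488m:.3f} SYPD")
    print(f"{wall_days_per_simulated_year(sypd_488m):.1f} wall-clock days per simulated year")
```

Under this convention, 15 simulated days per day is about 0.041 SYPD, so one simulated year takes a little over 24 wall-clock days, consistent with the rounded figures quoted in the text.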

Paper Structure (14 sections, 1 equation, 5 figures, 1 table)

Figures (5)

  • Figure 1: Simulated years per megawatt-hour of energy (SYPMWh) versus number of grid points for state-of-the-art atmosphere and ocean models. Stars show the performance of our ocean model in realistic and "aqua-planet" (AP) setups.
  • Figure 2: Left: time-stepping sequence. Right: different domains over which 2D fast and 3D slow mode updates take place (here assuming 1 barotropic substep per baroclinic step -- halo region of size 1 -- and second-order methods -- outer region of size 1)
  • Figure 3: Vertical vorticity on September 1st, after a one-year integration, as simulated by Oceananigans12 (top left) and Oceananigans48 (bottom left). Insets on the right zoom in on particularly energetic current systems: the Agulhas and the East Australian Currents. While major ocean currents with widths of 10-100 km are resolved in both simulations, the sharp density fronts and associated currents that develop at the ocean surface in winter at scales of 1-10 km (the ocean weather) are resolved only by Oceananigans48. On September 1st (spring in the southern hemisphere, fall in the northern hemisphere), such sharp frontal features populate the southern ocean but are suppressed in the north.
  • Figure 4: Strong scaling tests for the realistic setups OceananigansR12 ($1/12^\circ$), OceananigansR24 ($1/24^\circ$), and OceananigansR48 ($1/48^\circ$). The left plot reports simulated years per wall clock day (SYPD), while the right plot reports wall clock milliseconds per time step. All results are averaged over 1500 time steps.
  • Figure 5: Weak scaling tests performed in double precision with the OceananigansAP setup. Each GPU holds a grid equivalent to a global $1/6^\circ$ grid with 100 vertical layers. The weak scaling extends to a horizontal resolution of $1/168^\circ$ ($\sim$488 m), where we achieve 15 simulated days per wall clock day (1 year in roughly 25 days). The star marks the performance of OceananigansR48 (Figure 3) on 144 Perlmutter GPU nodes. All results are averaged over 500 time steps.
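
The two panels of Figure 4 report the same measurement in different units, so one can be derived from the other once the model time step is known. The Python sketch below illustrates that conversion, along with a conventional strong-scaling efficiency ratio. The time step is not stated in this section, so `dt_seconds` and all numerical values in the usage example are purely hypothetical, as are the function names and the 365-day year.

```python
SECONDS_PER_DAY = 86_400.0
DAYS_PER_YEAR = 365.0  # assumption: 365-day calendar year


def sypd_from_step_cost(ms_per_step, dt_seconds):
    """Simulated years per wall-clock day, given the wall-clock cost of one
    time step (milliseconds) and the model time step (seconds)."""
    steps_per_wall_day = SECONDS_PER_DAY / (ms_per_step / 1000.0)
    simulated_seconds_per_wall_day = steps_per_wall_day * dt_seconds
    return simulated_seconds_per_wall_day / (SECONDS_PER_DAY * DAYS_PER_YEAR)


def strong_scaling_efficiency(ms_base, gpus_base, ms_scaled, gpus_scaled):
    """Ratio of GPU-time per step at the baseline to GPU-time per step after
    scaling out; 1.0 means perfect strong scaling."""
    return (ms_base * gpus_base) / (ms_scaled * gpus_scaled)


if __name__ == "__main__":
    # Hypothetical numbers, NOT measurements from the paper.
    print(sypd_from_step_cost(ms_per_step=100.0, dt_seconds=300.0))
    print(strong_scaling_efficiency(ms_base=100.0, gpus_base=8,
                                    ms_scaled=60.0, gpus_scaled=16))
```

Because SYPD scales inversely with milliseconds per step at a fixed time step, halving the step cost doubles the throughput. That is why the two panels carry the same scaling information.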