Physical Consistency of Aurora's Encoder: A Quantitative Study
Benjamin Richards, Pushpa Kumar Balan
TL;DR
The paper addresses the opacity of Aurora by testing whether its latent encoder preserves physically meaningful meteorological information. It trains linear probes to decode three concepts (the land–sea boundary, extreme temperatures, and atmospheric instability) from embeddings derived from ERA5 data, using dew point and K-index as instability proxies. The results show near-perfect land–sea separability, and strong but progressively less complete encoding of extreme events as their severity increases, highlighting both the promise and the limits of current representation learning for weather tasks. The work underscores the value of interpretability tools for validating AI-driven weather forecasting in high-stakes settings.
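For readers unfamiliar with the K-index mentioned above, it is a conventional convective-instability index built from temperature and dew point at the 850, 700, and 500 hPa levels. The sketch below uses the standard textbook formula; the function and argument names are hypothetical, and the paper may compute or threshold the index differently.

```python
def k_index(t850, t700, t500, td850, td700):
    """Conventional K-index, all inputs in degrees Celsius.

    t850, t700, t500: air temperature at 850, 700, and 500 hPa.
    td850, td700: dew point at 850 and 700 hPa.
    Argument names are illustrative assumptions, not taken from the paper.
    """
    lapse_term = t850 - t500          # low-to-mid-level temperature difference
    moisture_term = td850             # low-level moisture
    dryness_term = t700 - td700       # dew point depression at 700 hPa
    return lapse_term + moisture_term - dryness_term


# Example with made-up sounding values (degrees Celsius).
example_k = k_index(t850=20.0, t700=5.0, t500=-10.0, td850=15.0, td700=0.0)
print(example_k)
```

Larger values indicate a warmer, moister lower troposphere beneath colder mid-levels, which is why the index can serve as a proxy label for instability.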
Abstract
The high accuracy of large-scale weather forecasting models like Aurora is often accompanied by a lack of transparency, as their internal representations remain largely opaque. This "black box" nature hinders their adoption in high-stakes operational settings. In this work, we probe the physical consistency of Aurora's encoder by investigating whether its latent representations align with known physical and meteorological concepts. Using a large-scale dataset of embeddings, we train linear classifiers to identify three distinct concepts: the fundamental land-sea boundary, high-impact extreme temperature events, and atmospheric instability. Our findings provide quantitative evidence that Aurora learns physically consistent features, while also highlighting its limitations in capturing the rarest events. This work underscores the critical need for interpretability methods to validate and build trust in the next generation of AI-driven weather models.
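To make the probing setup concrete, the following is a minimal sketch of a linear probe: a logistic-regression classifier trained on frozen embeddings to predict a binary concept label. Everything here is an assumption for illustration. The data are synthetic stand-ins rather than Aurora embeddings, the helper names (`make_synthetic`, `train_linear_probe`, `accuracy`) are hypothetical, and the paper's actual training procedure, regularization, and evaluation metrics may differ.

```python
import math
import random


def make_synthetic(n, d, rng):
    """Synthetic stand-in for (embedding, concept label) pairs.

    Labels depend on a single fixed direction in embedding space, so the
    concept is linearly decodable by construction (an assumption).
    """
    direction = [1.0 if j % 2 == 0 else -1.0 for j in range(d)]
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
    y = [1 if sum(a * v for a, v in zip(direction, x)) > 0 else 0 for x in X]
    return X, y


def train_linear_probe(X, y, epochs=200, lr=0.5):
    """Fit logistic regression with full-batch gradient descent."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            z = sum(wj * xj for wj, xj in zip(w, xi)) + b
            z = max(-30.0, min(30.0, z))       # clamp for numerical safety
            p = 1.0 / (1.0 + math.exp(-z))     # predicted probability
            err = p - yi                       # gradient of log loss w.r.t. z
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(w, b, X, y):
    """Fraction of samples whose predicted label matches the true label."""
    preds = [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0 for xi in X]
    return sum(p == t for p, t in zip(preds, y)) / len(y)


rng = random.Random(0)                         # fixed seed for reproducibility
X, y = make_synthetic(n=400, d=8, rng=rng)
X_train, y_train = X[:300], y[:300]            # fit the probe on one split
X_test, y_test = X[300:], y[300:]              # evaluate on held-out samples

w, b = train_linear_probe(X_train, y_train)
test_acc = accuracy(w, b, X_test, y_test)
print(f"held-out probe accuracy: {test_acc:.3f}")
```

The probe is deliberately linear and the encoder is never updated, so high held-out accuracy suggests the concept is linearly accessible in the representation rather than learned by the probe itself. For rare events, accuracy alone can be misleading under class imbalance, so metrics such as precision and recall would be a more informative choice.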
