Interpretable Machine Learning for Weather and Climate Prediction: A Survey
Ruyi Yang, Jingyu Hu, Zihao Li, Jianli Mu, Tingzhao Yu, Jiangjiang Xia, Xuhong Li, Aritra Dasgupta, Haoyi Xiong
TL;DR
This survey addresses the need for interpretable machine learning in weather and climate prediction, where black-box models limit trust, debugging, and scientific insight. It divides explainability methods into post-hoc techniques and intrinsically interpretable models, reviewing post-hoc methods such as SHAP, LRP, Grad-CAM, and LIME and discussing self-explaining architectures such as linear, tree-based, and attention-based models. Key contributions include mapping explainability techniques to meteorological tasks, identifying gaps in mechanistic interpretability and evaluation benchmarks, and outlining how explanations can guide physics-informed improvements. The work underscores the practical value of explanations for building trust, supporting scientific discovery, and integrating data-driven methods with physical principles, especially in the context of large foundation models for weather forecasting.
Abstract
Advanced machine learning models have recently achieved high accuracy in weather and climate prediction. However, these complex models often lack inherent transparency and interpretability, acting as "black boxes" that impede user trust and hinder further model improvements. As such, interpretable machine learning techniques have become crucial in enhancing the credibility and utility of weather and climate modeling. In this survey, we review current interpretable machine learning approaches applied to meteorological predictions. We categorize methods into two major paradigms: 1) post-hoc interpretability techniques that explain pre-trained models, such as perturbation-based, game-theoretic, and gradient-based attribution methods; and 2) inherently interpretable models designed from scratch, using architectures such as tree ensembles and explainable neural networks. We summarize how each technique provides insight into model predictions and helps uncover novel meteorological relationships captured by machine learning. Lastly, we discuss research challenges around achieving deeper mechanistic interpretations aligned with physical principles, developing standardized evaluation benchmarks, integrating interpretability into iterative model development workflows, and providing explainability for large foundation models.
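To make the post-hoc paradigm concrete, the following is a minimal sketch of one perturbation-based attribution method, permutation importance, applied to a model treated as a black box. It is an illustration only and is not taken from the survey: the function names (`mse`, `permutation_importance`, `black_box`), the synthetic three-feature data, and the fixed linear stand-in model are all assumptions chosen so the result is easy to check.

```python
import random


def mse(predict, X, y):
    """Mean squared error of `predict` over the rows of X."""
    return sum((predict(row) - t) ** 2 for row, t in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Average increase in MSE when one feature column is shuffled.

    A large increase means the model relies on that feature; a value
    near zero means the feature has little effect on the predictions.
    """
    rng = random.Random(seed)
    base = mse(predict, X, y)
    scores = []
    for j in range(len(X[0])):
        total = 0.0
        for _ in range(n_repeats):
            # Shuffle column j only, leaving every other feature intact.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            total += mse(predict, X_perm, y) - base
        scores.append(total / n_repeats)
    return scores


# Hypothetical stand-in for a trained predictor: it depends strongly on
# feature 0, weakly on feature 1, and not at all on feature 2.
def black_box(row):
    return 3.0 * row[0] + 0.5 * row[1]


# Synthetic data (an assumption for illustration, not real weather data).
data_rng = random.Random(42)
X = [[data_rng.gauss(0.0, 1.0) for _ in range(3)] for _ in range(300)]
y = [black_box(row) for row in X]

importance = permutation_importance(black_box, X, y)
print(importance)
```

In this toy setting the scores recover the ranking built into the stand-in model: feature 0 matters most, feature 1 less, and feature 2 not at all. Game-theoretic and gradient-based attribution methods pursue the same goal with different machinery.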
