A Foundation Model for the Solar Dynamics Observatory

James Walsh; Daniel G. Gass; Raul Ramos Pollan; Paul J. Wright; Richard Galvez; Noah Kasmanoff; Jason Naradowsky; Anne Spalding; James Parr; Atılım Güneş Baydin

TL;DR

This paper discusses four key components: an ingestion pipeline to create machine-learning-ready datasets, the model architecture and training approach, the resulting embeddings and fine-tunable models, and downstream fine-tuned applications.

Abstract

SDO-FM is a foundation model built on data from NASA's Solar Dynamics Observatory (SDO) spacecraft. It integrates three separate instruments to encapsulate the Sun's complex physical interactions in a multi-modal embedding space. The model can streamline scientific investigations involving SDO by making its enormous datasets more computationally accessible for heliophysics research, and it enables investigations that require instrument fusion. We discuss four key components: an ingestion pipeline to create machine-learning-ready datasets, the model architecture and training approach, the resulting embeddings and fine-tunable models, and downstream fine-tuned applications. A key part of this effort has been to include subject matter specialists at each stage of development, reviewing the scientific value and providing guidance on model architecture, dataset, and training paradigm decisions. This paper marks the release of our pretrained models and embedding datasets, available to the community on Hugging Face and sdofm.org.

Paper Structure (18 sections, 9 figures, 1 table)

Introduction
Input Data
Scientific Direction of SDO-FM
Related Work
Method
Model choice
Solar-aware Masked Autoencoder
Nouveau-VAE
Scientific Validation Cases
Predict F10.7
Virtual EVE
Missing Channel Reconstruction
Autocalibration
Results
Reconstruction
...and 3 more sections

Figures (9)

Figure 1: Two methods of using the pre-trained backbone: directly, with an adaptor for fine-tuning, or with a new model that consumes the generated latent representation.
Figure 2: Visualizations of the Solar-aware Masked Autoencoder (left) and Nouveau-VAE (right). The Solar-aware Masked Autoencoder input has been transformed for full coverage, and the original Nouveau-VAE code base has been expanded to enable extraction of the latent representation.
Figure 3: Missing/corrupt data reconstruction process.
Figure 4: Instrument degradation prediction. Head architecture reproduced with permission from 2021AA648A53D.
Figure 5: Solar-aware Masked Autoencoder reconstruction with disk transform. In the circled areas, the peaks of some wavelengths are not captured well.
...and 4 more figures
