Contrastive Heliophysical Image Pretraining for Solar Dynamics Observatory Records
Shiyu Shen, Zhe Gao, Taifeng Chai, Yang Huang, Bin Pan
TL;DR
SolarCHIP addresses the need for domain-adapted visual backbones for multi-instrument SDO data by pretraining with cross-modal, time- and location-aware contrastive objectives. The method trains both CNN- and ViT-based autoencoders with three complementary contrastive losses (global class-token, fixed-position patch-level, and intra-sample) while reconstructing inputs. It achieves state-of-the-art performance on cross-modal translation between HMI and AIA passbands and on full-disk flare classification, with particularly strong gains in low-resource regimes. Pretrained weights and code are released to the community, offering a practical, reusable foundation for diverse solar-imaging tasks that reduces data and compute demands and enables label-efficient analysis.
Abstract
Deep learning has revolutionized solar image analysis, yet most approaches train task-specific encoders from scratch or rely on natural-image pretraining that ignores the unique characteristics of Solar Dynamics Observatory (SDO) data. We introduce SolarCHIP, a family of contrastively pretrained visual backbones tailored to multi-instrument SDO observations. SolarCHIP addresses three key challenges in solar imaging: multimodal sensing across AIA and HMI instruments, weak inter-class separability due to slow temporal evolution, and strong intra-class variability with sparse activity signals. Our pretraining framework employs a multi-granularity contrastive objective that jointly aligns (1) global class tokens across co-temporal AIA-HMI pairs to enhance temporal discrimination, (2) local patch tokens at fixed spatial indices to enforce position-consistent, modality-invariant features, and (3) intra-sample patches across different spatial locations to preserve fine-grained spatial structure. We train both CNN- and Vision Transformer-based autoencoders and demonstrate their effectiveness on two downstream tasks: cross-modal translation between HMI and AIA passbands via ControlNet, and full-disk flare classification. Experimental results show that SolarCHIP achieves state-of-the-art performance across both tasks, with particularly strong gains in low-resource settings where labeled data is limited. Ablation studies confirm that each contrastive component contributes essential discriminative capacity at different granularities. By publicly releasing pretrained weights and training code, we provide the heliophysics community with a practical, plug-and-play feature extractor that reduces computational requirements, improves label efficiency, and establishes a reusable foundation for diverse solar imaging applications.
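
As a rough illustration of the three-part objective described above, the following NumPy sketch computes one contrastive term per granularity from co-temporal AIA and HMI token embeddings. It is not the released implementation: the symmetric InfoNCE form, the temperature value, the equal weighting of the three terms, the choice of negatives for each term, and all function and argument names (`info_nce`, `multi_granularity_loss`, `cls_aia`, `patch_hmi`, and so on) are assumptions made for illustration. The reconstruction loss of the autoencoder is omitted.

```python
import numpy as np


def _l2_normalize(x, eps=1e-8):
    # Scale each row to unit length so dot products become cosine similarities.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


def _log_softmax(logits, axis):
    # Numerically stable log-softmax along the given axis.
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))


def info_nce(a, b, temperature=0.1):
    """Symmetric InfoNCE over two (N, D) arrays.

    Row i of `a` and row i of `b` form the positive pair; all other
    rows act as negatives. The temperature is an assumed value.
    """
    a, b = _l2_normalize(a), _l2_normalize(b)
    logits = a @ b.T / temperature
    idx = np.arange(a.shape[0])
    loss_ab = -_log_softmax(logits, axis=1)[idx, idx].mean()
    loss_ba = -_log_softmax(logits, axis=0)[idx, idx].mean()
    return float(0.5 * (loss_ab + loss_ba))


def multi_granularity_loss(cls_aia, cls_hmi, patch_aia, patch_hmi, temperature=0.1):
    """Sketch of a three-term objective (equal weights are an assumption).

    cls_*:   (B, D)    one global token per sample and modality
    patch_*: (B, P, D) patch tokens per sample and modality
    """
    n_samples, n_patches, _ = patch_aia.shape

    # (1) Global: co-temporal AIA-HMI class tokens are positives;
    #     other timestamps in the batch are negatives.
    global_term = info_nce(cls_aia, cls_hmi, temperature)

    # (2) Patch-level: at each fixed spatial index, align the two modalities
    #     of the same sample against the other samples in the batch.
    patch_term = float(np.mean([
        info_nce(patch_aia[:, p], patch_hmi[:, p], temperature)
        for p in range(n_patches)
    ]))

    # (3) Intra-sample: within one sample, a patch matches the other modality
    #     at the same location; other locations are negatives.
    intra_term = float(np.mean([
        info_nce(patch_aia[b], patch_hmi[b], temperature)
        for b in range(n_samples)
    ]))

    total = global_term + patch_term + intra_term
    return {"global": global_term, "patch": patch_term,
            "intra": intra_term, "total": total}
```

In this sketch, feeding identical embeddings for both modalities should give a much lower total than feeding embeddings whose sample and patch order has been reversed, which is a quick sanity check that each term rewards the intended alignment.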
