Towards Mitigating Systematics in Large-Scale Surveys via Few-Shot Optimal Transport-Based Feature Alignment

Sultan Hassan, Sambatra Andrianomena, Benjamin D. Wandelt

TL;DR

This work addresses distribution shifts caused by poorly understood systematics in large-scale surveys by proposing a few-shot feature-alignment framework that harmonizes OOD representations with those from a pre-trained ID model. The method trains a second copy of the model on OOD data and minimizes an alignment objective, comparing Mean Squared Error and Optimal Transport losses; the OT objective explicitly handles unpaired samples via a transport plan $\gamma$ with cost $c(z_i^{\rm ID}, z_j^{\rm OOD})$ and constraint $\gamma \in \Pi(\mu, \nu)$. Empirical results on MNIST and CAMELS HI maps show that OT yields robust OOD generalization, achieving substantial improvements after minimal optimization steps in paired and unpaired settings, including few-shot scenarios, while MSE struggles without explicit pairing. These findings suggest OT-based feature alignment as a practical tool for reliable inferences under domain shifts in upcoming surveys, with broad applicability to other scientific domains.
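For reference, the transport-plan objective described above can be written in the standard discrete Kantorovich form. This is a sketch using the symbols from the summary, not necessarily the paper's exact equation; the sample counts $n$ and $m$ and the uniform weights on $\mu$ and $\nu$ are assumptions made here for concreteness.

```latex
\mathcal{L}_{\rm OT}
  = \min_{\gamma \in \Pi(\mu, \nu)}
    \sum_{i=1}^{n} \sum_{j=1}^{m}
    \gamma_{ij}\, c\!\left(z_i^{\rm ID}, z_j^{\rm OOD}\right),
\qquad
\Pi(\mu, \nu)
  = \left\{ \gamma \in \mathbb{R}_{+}^{n \times m} :
      \gamma \mathbf{1}_m = \mu,\;
      \gamma^{\top} \mathbf{1}_n = \nu \right\},
% Assumption: uniform empirical weights, mu_i = 1/n and nu_j = 1/m.
```

Because the minimization runs over all couplings $\gamma$, the value does not depend on the order in which the OOD features are listed, which is why no explicit ID-to-OOD pairing is needed.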

Abstract

Systematics contaminate observables, leading to distribution shifts relative to theoretically simulated signals and posing a major challenge for using pre-trained models to label such observables. Since systematics are often poorly understood and difficult to model, removing them directly and entirely may not be feasible. To address this challenge, we propose a novel method that aligns learned features between in-distribution (ID) and out-of-distribution (OOD) samples by optimizing a feature-alignment loss on the representations extracted from a pre-trained ID model. We first experimentally validate the method on the MNIST dataset using possible alignment losses, including mean squared error and optimal transport, and subsequently apply it to large-scale maps of neutral hydrogen. Our results show that optimal transport is particularly effective at aligning OOD features when parity between ID and OOD samples is unknown, even with limited data, mimicking real-world conditions in extracting information from large-scale surveys. Our code is available at https://github.com/sultan-hassan/feature-alignment-for-OOD-generalization.
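To illustrate why pairing matters for the two alignment losses, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a squared Euclidean cost, uniform sample weights, and an entropic (Sinkhorn) approximation to the transport plan; the function names, the `reg` and `n_iters` parameters, and the synthetic features are all hypothetical.

```python
import numpy as np

def sq_euclidean_cost(z_id, z_ood):
    """Pairwise squared Euclidean cost between feature rows, shape (n, m)."""
    diff = z_id[:, None, :] - z_ood[None, :, :]
    return np.sum(diff ** 2, axis=-1)

def sinkhorn_plan(cost, reg=0.05, n_iters=500):
    """Entropic approximation to the transport plan, uniform marginals assumed."""
    n, m = cost.shape
    mu = np.full(n, 1.0 / n)
    nu = np.full(m, 1.0 / m)
    # Scale the regularization by the largest cost so the kernel does not underflow.
    scale = max(cost.max(), 1e-12)
    kernel = np.exp(-cost / (reg * scale))
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = mu / (kernel @ v)
        v = nu / (kernel.T @ u)
    return u[:, None] * kernel * v[None, :]

def ot_loss(z_id, z_ood):
    """Transport cost under the (approximate) plan; ignores row order."""
    cost = sq_euclidean_cost(z_id, z_ood)
    return float(np.sum(sinkhorn_plan(cost) * cost))

def mse_loss(z_id, z_ood):
    """Mean squared distance between row-aligned features; needs pairing."""
    return float(np.mean(np.sum((z_id - z_ood) ** 2, axis=1)))

# Synthetic stand-ins for ID and OOD features (not real model outputs).
rng = np.random.default_rng(0)
z_id = rng.normal(size=(32, 8))
z_ood_paired = z_id + 0.01 * rng.normal(size=(32, 8))
perm = rng.permutation(32)
z_ood_shuffled = z_ood_paired[perm]  # same samples, pairing lost

mse_paired = mse_loss(z_id, z_ood_paired)
mse_shuffled = mse_loss(z_id, z_ood_shuffled)
ot_paired = ot_loss(z_id, z_ood_paired)
ot_shuffled = ot_loss(z_id, z_ood_shuffled)
print(f"MSE paired={mse_paired:.4f}  shuffled={mse_shuffled:.4f}")
print(f"OT  paired={ot_paired:.4f}  shuffled={ot_shuffled:.4f}")
```

In this sketch the MSE value changes sharply once the rows are shuffled, while the OT value is unchanged up to floating-point error, since the plan is recomputed from the pairwise costs. In an actual training loop the loss would have to be differentiable with respect to the OOD features, which this NumPy version does not provide.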

Paper Structure

This paper contains 7 sections, 1 equation, 3 figures, 1 table, 1 algorithm.

Figures (3)

  • Figure 1: Schematic view of the steps used to mitigate the impact of systematics on labeling observables. The first phase involves pre-training on ID samples, followed by a second phase where the pre-trained model is optimized using a feature alignment loss between ID and OOD samples.
  • Figure 2: OOD classification accuracy evolution after feature alignment using mean squared error (MSE) and optimal transport (OT). At the beginning of the feature alignment training, the pre-trained model has very high accuracy on the IDs (0.985) but only random-guess-level performance on the OODs (0.5). This OOD accuracy of 0.5 serves as the starting point at epoch 0. Significant improvement is achieved after a single backpropagation step in the many-sample case with OT, whereas many more optimization steps ($\sim$20-40) are required in the few-sample case. In all cases, feature alignment with OT improves OOD accuracy without using any labels.
  • Figure 3: A random realization of a 256$\times$256 HI map at redshift $z = 0$ from the CAMELS dataset. The out-of-distribution (OOD) sample is generated by adding white noise, mimicking the effect of thermal noise in radio interferometric observations.
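The OOD construction described for Figure 3 (adding white noise to an ID map) can be sketched as follows. This is an illustrative example only: the map here is a random placeholder rather than a CAMELS HI map, and the function name `add_white_noise` and the chosen noise level are assumptions, not values from the paper.

```python
import numpy as np

def add_white_noise(hi_map, noise_std, rng):
    """Return a copy of the map with zero-mean Gaussian white noise added per pixel."""
    return hi_map + rng.normal(0.0, noise_std, size=hi_map.shape)

rng = np.random.default_rng(0)
# Placeholder 256x256 "ID" map; a real HI map would be loaded from the dataset instead.
id_map = rng.lognormal(mean=0.0, sigma=1.0, size=(256, 256))
# Hypothetical noise level, set relative to the map's own scatter.
noise_std = 0.5 * id_map.std()
ood_map = add_white_noise(id_map, noise_std, rng)
print(ood_map.shape, float((ood_map - id_map).std()))
```

White noise here means the perturbation is drawn independently for each pixel, so it has no spatial correlation.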