VisRec: A Semi-Supervised Approach to Radio Interferometric Data Reconstruction

Ruoqi Wang, Haitao Wang, Qiong Luo, Feng Wang, Hejun Wu

TL;DR

VisRec addresses the reconstruction of sparse, noisy radio interferometric visibilities when labeled data are limited. It is a model-agnostic semi-supervised framework that combines supervised reconstruction on augmented labeled data with unsupervised consistency training, so that unlabeled visibilities also contribute to learning. The key contributions are a dual augmentation strategy (label-invariant and label-variant), a corruption-based consistency objective, and a simple semi-supervised loss: $\mathcal{L}_{total} = \mathcal{L}_{sup} + \lambda \mathcal{L}_{cons}$. Empirically, VisRec outperforms traditional CLEAN and state-of-the-art supervised and self-supervised baselines in reconstruction quality, robustness to observation perturbations, and cross-telescope generalization (e.g., between EHT and VLBA), which suggests practical value for radio astronomy imaging.

Abstract

Radio telescopes produce visibility data about celestial objects, but these data are sparse and noisy. As a result, images created from raw visibility data are of low quality. Recent studies have used deep learning models to reconstruct visibility data and obtain cleaner images. However, these methods rely on a substantial amount of labeled training data, which requires significant labeling effort from radio astronomers. To address this challenge, we propose VisRec, a model-agnostic semi-supervised learning approach to the reconstruction of visibility data. Specifically, VisRec consists of a supervised learning module and an unsupervised learning module. In the supervised learning module, we introduce a set of data augmentation functions to produce diverse training examples. The unsupervised learning module, in contrast, augments unlabeled data and uses reconstructions from non-augmented visibility data as pseudo-labels for training. This hybrid approach allows VisRec to leverage both labeled and unlabeled data, so it performs well even when labeled data are scarce. Our evaluation results show that VisRec outperforms all baseline methods in reconstruction quality, robustness against common observation perturbations, and generalizability to different telescope configurations.

Paper Structure (19 sections, 10 equations, 8 figures, 3 tables, 1 algorithm)

This paper contains 19 sections, 10 equations, 8 figures, 3 tables, 1 algorithm.

Figures (8)

  • Figure 1: Illustration of radio interferometric data processing. The telescopes collect visibility data. Images formed from the raw data, called dirty images, are dominated by artifacts. In contrast, images formed from sparse-to-dense reconstructed visibility data are cleaner.
  • Figure 2: Overview of our method. In our semi-supervised framework, labeled data undergo supervised training, as depicted by the blue arrows. The labeled data are augmented with label-variant and label-invariant augmentations $T_{var}$ and $T_{inv}$. Then the neural network, denoted by $f_{\theta}$, processes augmented visibility data to produce reconstructions. These reconstructions are then compared against ground-truth references to compute supervised loss $\mathcal{L}_{sup}$. Unlabeled sparse visibility data are augmented by $T_{corr}$, and the same $f_{\theta}$ reconstructs both the non-augmented and augmented visibility data. The reconstruction from the non-augmented unlabeled data serves as a pseudo-label for that with augmentations to compute the consistency loss $\mathcal{L}_{cons}$. The overall loss combines supervised and consistency losses in a weighted sum: $\mathcal{L}_{total} \gets \mathcal{L}_{sup} + \lambda \mathcal{L}_{cons}$.
  • Figure 3: Visual examples of the overall comparison. VwoSA denotes VisRec w/o Sup-Aug.
  • Figure 4: Effect of Labeled Data Size.
  • Figure 5: Performance of different methods across various noise levels. We only show PSNR and SSIM values because LFD is not applicable to Dirty, CLEAN, and Noise2Astro.
  • ...and 3 more figures
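
The training objective described in the Figure 2 caption, $\mathcal{L}_{total} = \mathcal{L}_{sup} + \lambda \mathcal{L}_{cons}$, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the L1 distance, the Gaussian-noise corruption standing in for $T_{corr}$, and all function names (`l1`, `corrupt`, `total_loss`) are assumptions made for illustration, and the supervised augmentations $T_{var}$ and $T_{inv}$ are omitted for brevity. The model is any callable that maps a list of complex visibility samples to a reconstructed list of the same length.

```python
import random
from typing import Callable, List, Sequence, Tuple

Vis = List[complex]           # flattened visibility samples (complex-valued)
Model = Callable[[Vis], Vis]  # stand-in for the network f_theta


def l1(pred: Sequence[complex], target: Sequence[complex]) -> float:
    """Mean absolute difference; the choice of L1 is an assumption."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def mean(values: Sequence[float]) -> float:
    """Average of a list, defined as 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


def corrupt(vis: Vis, noise_std: float, rng: random.Random) -> Vis:
    """Hypothetical stand-in for T_corr: additive complex Gaussian noise."""
    return [
        v + complex(rng.gauss(0.0, noise_std), rng.gauss(0.0, noise_std))
        for v in vis
    ]


def total_loss(
    model: Model,
    labeled: Sequence[Tuple[Vis, Vis]],  # (sparse input, ground-truth reference)
    unlabeled: Sequence[Vis],            # sparse inputs without references
    lam: float,                          # weight lambda on the consistency term
    rng: random.Random,
    noise_std: float = 0.1,
) -> float:
    # Supervised term: reconstruction vs. ground-truth reference.
    sup = mean([l1(model(x), y) for x, y in labeled])

    # Consistency term: the reconstruction of the clean input acts as a
    # pseudo-label for the reconstruction of the corrupted input. In an
    # autograd framework the pseudo-label would typically be treated as a
    # constant (no gradient); that detail is an assumption here.
    cons_terms = []
    for u in unlabeled:
        pseudo = model(u)
        pred = model(corrupt(u, noise_std, rng))
        cons_terms.append(l1(pred, pseudo))

    return sup + lam * mean(cons_terms)
```

With an identity model and zero corruption both terms vanish, while any nonzero corruption makes the consistency term positive; this is the signal that lets unlabeled visibilities contribute to training.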