J-PAS: Semi-Supervised Sim-to-Obs Transfer for Robust Star--Galaxy--Quasar Classification

Daniel López-Cano, L. Raul Abramo, L. Nakazono, I. Pérez-Ràfols, G. Martínez-Solaeche, J. Chaves-Montero, Matthew M. Pieri, Jailson Alcaniz, Narciso Benitez, Silvia Bonoli, Saulo Carneiro, Javier Cenarro, David Cristóbal-Hornillos, Simone Daflon, Renato Dupke, Alessandro Ederoclite, Rosa González Delgado, Antonio Hernán-Caballero, Carlos Hernández-Monteagudo, Jifeng Liu, Carlos López-Sanjuan, Antonio Marín-Franch, Claudia Mendes de Oliveira, Mariano Moles, Fernando Roig, Laerte Sodré, Keith Taylor, Jesús Varela, Héctor Vázquez Ramió, Jose Vilchez, Javier Zaragoza-Cardiel

TL;DR

This study shows how modest target supervision enables robust, data-efficient simulation-to-observation transfer when simulations are plentiful but target labels are scarce.

Abstract

Modern studies in astrophysics and cosmology increasingly rely on simulations and cross-survey analyses, yet differences in data generation, instrumentation, calibration, and unmodeled physics introduce distribution mismatches between datasets (``domain shift''). In machine-learning pipelines, this occurs when the joint distribution of inputs and labels differs between the training (source) and application (target) domains, causing source-trained models to underperform on the target. Transfer learning and domain adaptation provide principled ways to mitigate this effect. We study a concrete simulation-to-observation case: semi-supervised domain adaptation (SSDA) to transfer a four-class spectral classifier -- high-redshift quasars, low-redshift quasars, galaxies, and stars -- from J-PAS mock catalogs based on DESI spectra to real J-PAS observations. Our pipeline pretrains on abundant labeled DESI$\rightarrow$J-PAS mocks and adapts to the target domain using a small labeled J-PAS subset. We benchmark SSDA against two baselines: a J-PAS--only supervised model trained with the same target-label budget, and a mocks-only model evaluated on held-out J-PAS data. On this held-out J-PAS data, SSDA achieves a macro-F1 score (balancing precision and recall) of $0.82$ and an overall true positive rate of $0.89$, compared to $0.79/0.85$ for the J-PAS--only baseline and $0.73/0.87$ for the mocks-only model. The gains are driven primarily by improved quasar classification, especially in the high-redshift subclass ($\mathrm{F1}=0.66$ vs.\ $0.55/0.37$), yielding better-calibrated candidate lists for spectroscopic targeting (e.g., WEAVE-QSO) and AGN searches. This study shows how modest target supervision enables robust, data-efficient simulation-to-observation transfer when simulations are plentiful but target labels are scarce.
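The macro-F1 score quoted above can be illustrated with a minimal sketch that derives per-class TPR (recall), PPV (precision), and F1 from a confusion matrix and then averages F1 over the four classes. It assumes the standard definitions (F1 as the harmonic mean of TPR and PPV); the paper's own equation is not reproduced in this excerpt, and the function names and the toy counts below are hypothetical, not the paper's results.

```python
from typing import Dict, List

# Class names as they appear in the figure captions.
CLASSES = ["QSO_high", "QSO_low", "GALAXY", "STAR"]


def per_class_metrics(cm: List[List[int]]) -> Dict[str, Dict[str, float]]:
    """Per-class TPR, PPV and F1 from a confusion matrix.

    Convention (assumed): cm[i][j] is the number of objects whose true
    class is i and whose predicted class is j.
    """
    n = len(cm)
    out: Dict[str, Dict[str, float]] = {}
    for k in range(n):
        tp = cm[k][k]
        fn = sum(cm[k]) - tp                        # true k, predicted otherwise
        fp = sum(cm[i][k] for i in range(n)) - tp   # predicted k, truly otherwise
        tpr = tp / (tp + fn) if (tp + fn) else 0.0
        ppv = tp / (tp + fp) if (tp + fp) else 0.0
        f1 = 2 * tpr * ppv / (tpr + ppv) if (tpr + ppv) else 0.0
        out[CLASSES[k]] = {"TPR": tpr, "PPV": ppv, "F1": f1}
    return out


def macro_f1(cm: List[List[int]]) -> float:
    """Unweighted mean of the per-class F1 scores."""
    metrics = per_class_metrics(cm)
    return sum(m["F1"] for m in metrics.values()) / len(metrics)


# Made-up counts for illustration only (rows: true class, columns: predicted).
toy_cm = [
    [8, 2, 0, 0],
    [2, 6, 2, 0],
    [0, 1, 9, 0],
    [0, 0, 0, 10],
]

if __name__ == "__main__":
    for name, m in per_class_metrics(toy_cm).items():
        print(f"{name:9s} TPR={m['TPR']:.2f} PPV={m['PPV']:.2f} F1={m['F1']:.2f}")
    print(f"macro-F1 = {macro_f1(toy_cm):.3f}")
```

Because the macro average weights every class equally, a rare class such as the high-redshift quasars affects the score as much as the abundant galaxy and star classes, which is why gains in that subclass show up clearly in this metric.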

Paper Structure (17 sections, 4 equations, 7 figures, 1 table)


Figures (7)

  • Figure 1: Representative photometric SEDs, comparing J-PAS (solid) with DESI$\rightarrow$J-PAS Mocks (dashed). Each solid/dashed pair corresponds to a distinct individual object with a one-to-one mock counterpart (i.e., the dashed curve is the mock constructed for that same source); the multiple curves shown in each panel are therefore not averaged templates but separate objects plotted together for illustration. Panels correspond to different classes. Curves span the effective 55-band set after reliability masking (Section \ref{subsec:datasets_curation}). For visual clarity, we display examples with $i$-band signal-to-noise ratio $>200$; this high-S/N cut is used only for visualization and is not applied elsewhere in the analysis. Within each class, the plotted objects are selected from the subset passing this visualization-only S/N cut (randomly chosen from the eligible set). For legibility, we also omit error bars. We intentionally retain bands with missing flux measurements in J-PAS (visible as gaps) to reflect real observational coverage.
  • Figure 2: Class balance in our two working datasets: full DESI$\rightarrow$J-PAS Mocks (source; dashed) versus the labeled J-PAS$\times$DESI subset (target; solid). The y-axis shows relative frequency per magnitude bin normalized by the total size of the mock catalog ($\simeq 1.5\times10^{6}$ unique sources); consequently, curves appear on the same percentage scale and trace each other in shape even though absolute counts differ widely ($N_{\mathrm{Mocks}}\!\gg\!N_{\mathrm{J\text{-}PAS}}$; see the total per-class counts in the annotated boxes). The figure includes all objects after our data curation (Sect. \ref{subsec:datasets_curation}); no train/validation/test splits or magnitude cuts are applied here.
  • Figure 3: Confusion matrices for four complementary evaluations: (top left) J-PAS supervised: target-only baseline, (top right) Mocks no-DA: source-domain evaluation on the mock test split, (bottom left) J-PAS no-DA: zero-shot transfer of the mocks-trained model, and (bottom right) J-PAS DA: the same model after semi-supervised adaptation. Cells show counts; the colormap is row-normalized to aid visual comparison. Diagonals also list TPR (recall), PPV (precision), and F1 (see Eq. \ref{eq:tpr_ppv_f1}) for each class.
  • Figure 4: Per-class summary diagnostics. Top: F1 scores by class (radar plot) shown for all four regimes (different markers with distinct colors and linestyles). Bottom: Per-class ROC curves and AUC values; the low-FPR region (left) is most relevant for building clean follow-up target lists. Both panels reflect the same trend seen in Fig. \ref{fig:confusion-matrices}: SSDA boosts the quasar subclasses while leaving GALAXY/STAR essentially saturated.
  • Figure 5: Quasar class probabilities as a function of redshift (two-panel view). Top: range $0<z<5$. Bottom: zoom on the transition near $z\simeq2.1$ (red box in the top panel marks the window magnified below). For objects whose true labels are QSO_high (solid curves) or QSO_low (dashed curves), we plot the predicted class probabilities as a function of spectroscopic redshift: blue = $P(\texttt{QSO\_high})$, orange = $P(\texttt{QSO\_low})$, green = $P(\texttt{GALAXY})$ (the STAR channel is omitted for clarity). Curves show the binned median probability in redshift, smoothed for readability; shaded bands indicate a conservative dispersion of $\pm\frac{1}{3}\sigma$ (one third of the per-bin sample standard deviation) around each median. The vertical dotted line denotes the subclass split at $z=2.1$; the horizontal dotted line marks the reference level $p=0.5$. Together, the panels visualize the two main sources of ambiguity discussed in the text: low-$z$ host dilution (QSO_low vs. GALAXY) and mutual QSO_high/QSO_low confusion at the $z\simeq 2.1$ boundary.
  • ...and 2 more figures
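The binned-median curves and $\pm\frac{1}{3}\sigma$ bands described in the Figure 5 caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' plotting code: the function name, the bin edges, and the toy values are hypothetical, bins are taken as half-open intervals, and the smoothing step mentioned in the caption is omitted.

```python
import statistics
from typing import List, Sequence, Tuple


def binned_median_band(
    z: Sequence[float],
    p: Sequence[float],
    edges: Sequence[float],
    frac: float = 1.0 / 3.0,
) -> List[Tuple[float, float, float, float]]:
    """Median of p in redshift bins, with a band of +/- frac * sigma.

    Returns one tuple (bin_center, median, lower, upper) per bin, where
    sigma is the per-bin sample standard deviation. Bins holding fewer
    than two objects are skipped, since the sample standard deviation is
    undefined for them.
    """
    rows = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        vals = [pi for zi, pi in zip(z, p) if lo <= zi < hi]
        if len(vals) < 2:
            continue
        med = statistics.median(vals)
        half_width = frac * statistics.stdev(vals)  # stdev = sample std (n - 1)
        rows.append((0.5 * (lo + hi), med, med - half_width, med + half_width))
    return rows


# Toy values for illustration only: redshifts and one class probability.
toy_z = [0.1, 0.4, 0.8, 1.2, 1.5, 1.9]
toy_p = [0.2, 0.4, 0.6, 0.5, 0.7, 0.9]
toy_edges = [0.0, 1.0, 2.0]

if __name__ == "__main__":
    for center, med, low, high in binned_median_band(toy_z, toy_p, toy_edges):
        print(f"z~{center:.1f}: median={med:.2f}, band=[{low:.3f}, {high:.3f}]")
```

Using one third of the standard deviation keeps the shaded bands narrow enough that the median trends of the three probability channels stay distinguishable when they are overplotted.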