OJALÁ: Optimizing J-PAS Astronomy for Large-scale Analysis. A foundation model for the SED of galaxies, QSOs and stars

G. Martínez-Solaeche; R. M. González Delgado; R. García-Benito; A. Hernán-Caballero; I. Pérez-Ràfols; L. A. Díaz-García; L. Raul Abramo; J. E. Rodríguez-Martín; A. M. Conrado; I. Breda; H. Domínguez Sánchez; I. Márquez; M. Pieri; D. López-Cano; V. M. Placco; L. Nakazono; A. del Pino; V. Marra; J. Alcaniz; N. Benitez; S. Bonoli; S. Carneiro; A. J. Cenarro; D. Cristóbal-Hornillos; S. Daflon; R. A. Dupke; A. Ederoclite; C. Hernández-Monteagudo; J. Liu; C. López-Sanjuan; A. Marín-Franch; C. Mendes de Oliveira; M. Moles; F. Roig; L. Sodré; K. Taylor; J. Varela; H. Vázquez Ramió; J. M. Vílchez; J. Zaragoza-Cardiel

Abstract

The advent of large-scale surveys requires efficient machine-learning (ML) techniques to exploit the information in massive datasets. We present OJALÁ, a transformer-based autoregressive foundation model designed to simultaneously classify astronomical objects and infer their physical parameters using 54 narrow bands from J-PAS, combined with broad bands from the DESI Legacy Imaging Surveys and WISE. The model is trained on $\sim 20$ million synthetic spectral energy distributions (SEDs) generated from DESI DR1 spectra. We validate OJALÁ using a cross-matched sample of $\sim 121{,}000$ objects between J-PAS and DESI. The model achieves a weighted F1-score of approximately 0.9 for spectral classification (stars, galaxies, and QSOs) at $i < 21$. For galaxies, we recover photometric redshifts (photo-z) with a precision of $\sigma_{\rm NMAD} < 0.01$, while for QSOs the precision improves significantly at $z > 1.5$, reaching $\sigma_{\rm NMAD} \approx 0.006$ at $z \approx 3.5$. We demonstrate robust estimation of physical properties for galaxies, recovering stellar masses and star formation rates (SFRs) with a scatter of approximately 0.11 dex and 0.22 dex, respectively. Furthermore, the model accurately predicts equivalent widths (EWs) for major optical emission lines, allowing for the derivation of extinction-corrected H$\alpha$ luminosities with a scatter of 0.29 dex. OJALÁ successfully reproduces the BPT and WHAN diagnostic diagrams, classifying star-forming (SF), AGN, and passive galaxies with F1-scores typically ranging from 70% to 90% depending on the diagnostic class. For stars, the model reliably infers effective temperature and metallicity, though surface gravity remains challenging. Finally, we show the modularity of the architecture by fine-tuning the pre-trained embeddings to predict black hole (BH) masses, a property not included in the primary training, recovering spectroscopic virial estimates with a precision of approximately 0.5 dex. We release the code, model weights, and a comprehensive value-added catalogue (VAC) for the J-PAS Early Data Release (EDR).
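
The photo-z precision above is quoted as $\sigma_{\rm NMAD}$. As a rough illustration only, the sketch below computes the normalized median absolute deviation in its commonly used form, $1.4826 \times \mathrm{median}\left(|\Delta z - \mathrm{median}(\Delta z)|\right)$ with $\Delta z = (z_{\rm phot} - z_{\rm spec})/(1 + z_{\rm spec})$. The abstract does not spell out the exact definition adopted, so the formula, the function name `sigma_nmad`, and the input values are assumptions made for this example.

```python
from statistics import median


def sigma_nmad(z_phot, z_spec):
    """Normalized median absolute deviation of photo-z residuals.

    Assumed (common) convention; the definition used in the paper may differ.
    """
    # Residuals scaled by (1 + z_spec), the usual photo-z normalization.
    dz = [(zp - zs) / (1.0 + zs) for zp, zs in zip(z_phot, z_spec)]
    centre = median(dz)
    # The factor 1.4826 makes the MAD match the standard deviation
    # for Gaussian-distributed errors.
    return 1.4826 * median(abs(d - centre) for d in dz)


# Hypothetical redshifts, made up purely to exercise the function.
z_spec_demo = [0.0, 0.0, 0.0]
z_phot_demo = [-0.01, 0.0, 0.01]
print(sigma_nmad(z_phot_demo, z_spec_demo))
```

Because the estimator is built on medians, a small fraction of catastrophic outliers has little effect on it, which is why it is often preferred over a plain standard deviation for photo-z quality.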
