Astroconformer: The Prospects of Analyzing Stellar Light Curves with Transformer-Based Deep Learning Models
Jia-Shu Pan, Yuan-Sen Ting, Jie Yu
TL;DR
Astroconformer introduces a Transformer-based framework that fuses self-attention with convolution to analyze stellar light curves in the time domain, capturing long-range correlations and phase information that power spectra overlook. The model uses patch embeddings and Rotary Positional Encoding to turn 90-day Kepler light curves into a sequence, which is processed by an encoder combining 8-head multi-head self-attention (MHSA) with convolutional modules to predict $\log g$ from the full time series. It achieves state-of-the-art performance, with RMSE as low as $0.017$ dex near $\log g\approx3$ and precise $\nu_{\max}$ estimates (relative median absolute error $<2\%$ for 90-day red giant light curves and $<3\%$ for 30-day segments), outperforming both k-NN and CNN baselines and competing with traditional asteroseismic pipelines on limited data. Attention maps provide interpretability: they reveal sensitivity to both oscillations and granulation, indicating that Astroconformer leverages non-Gaussian phase information and long-timescale stellar signals. The work demonstrates the potential of Transformer-based architectures for scalable, high-precision asteroseismology in upcoming surveys with varying cadences and observation windows.
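To make the tokenization step concrete, the following is a minimal NumPy sketch of patch embedding followed by rotary positional encoding. It is an illustration under stated assumptions, not the authors' implementation: the function names `patchify` and `rotary_encode`, the patch length, the embedding width, the random projection standing in for a learned linear layer, and the synthetic light curve are all hypothetical choices made for this example.

```python
import numpy as np


def patchify(flux, patch_len):
    """Split a 1-D light curve into non-overlapping patches.

    Any trailing samples that do not fill a whole patch are dropped.
    """
    flux = np.asarray(flux, dtype=float)
    n_patches = len(flux) // patch_len
    return flux[: n_patches * patch_len].reshape(n_patches, patch_len)


def rotary_encode(x, base=10000.0):
    """Apply rotary positional encoding to x of shape (seq_len, dim), dim even.

    Feature i is paired with feature i + dim/2, and each pair is rotated by an
    angle proportional to the token position ("rotate-half" convention).
    """
    seq_len, dim = x.shape
    half = dim // 2
    freqs = base ** (-np.arange(half) / half)              # (half,)
    angles = np.arange(seq_len)[:, None] * freqs[None, :]  # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=1)


# Illustrative usage on a synthetic light curve (all sizes are assumptions).
rng = np.random.default_rng(0)
flux = rng.normal(size=4000)               # stand-in for one quarter of photometry
patches = patchify(flux, patch_len=20)     # (200, 20)
W = rng.normal(size=(20, 64)) / np.sqrt(20)  # stand-in for a learned projection
tokens = patches @ W                       # (200, 64) patch embeddings
queries = rotary_encode(tokens)            # positions enter as rotations
```

Because each position is encoded as a rotation, the dot product between an encoded query and an encoded key depends only on their relative offset, which is one reason this kind of encoding is a natural fit for time series of varying length.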
Abstract
Stellar light curves contain valuable information about oscillations and granulation, offering insights into stars' internal structures and evolutionary states. Traditional asteroseismic techniques, which focus primarily on power spectral analysis, often overlook the crucial phase information in these light curves. To address this gap, recent machine learning applications, particularly those using Convolutional Neural Networks (CNNs), have made strides in inferring stellar properties from light curves. However, CNNs are limited by their localized feature extraction. In response, we introduce $\textit{Astroconformer}$, a Transformer-based deep learning framework specifically designed to capture long-range dependencies in stellar light curves. Our empirical analysis centers on estimating surface gravity ($\log g$), using a dataset derived from single-quarter Kepler light curves with $\log g$ values ranging from 0.2 to 4.4. $\textit{Astroconformer}$ demonstrates superior performance, achieving a root-mean-square error (RMSE) of 0.017 dex at $\log g\approx3$ in data-rich regimes, rising to as much as 0.1 dex in sparser regimes. This performance surpasses both k-nearest neighbor models and advanced CNNs. Ablation studies highlight the influence of receptive field size on model effectiveness, with larger receptive fields yielding better results. $\textit{Astroconformer}$ also excels at extracting $\nu_{\max}$ with high precision: it achieves a relative median absolute error below 2% for 90-day red giant light curves. Notably, the error remains under 3% for 30-day light curves, whose oscillations are undetectable by a conventional pipeline in 30% of cases. Furthermore, the attention mechanisms in $\textit{Astroconformer}$ align closely with the characteristics of stellar oscillations and granulation observed in light curves.
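For readers who want to reproduce the two headline metrics on their own predictions, here is a minimal sketch. It assumes that "relative median absolute error" means the median of $|\hat{y}-y|/y$ over the sample; the abstract does not spell out the definition, so treat that reading, and the helper names `rmse` and `relative_median_abs_error`, as assumptions of this example. The numbers in the usage lines are made up for illustration and are not results from the paper.

```python
import numpy as np


def rmse(pred, true):
    """Root-mean-square error between predictions and reference values."""
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    return float(np.sqrt(np.mean((pred - true) ** 2)))


def relative_median_abs_error(pred, true):
    """Median of |pred - true| / true (assumed definition, see lead-in)."""
    pred = np.asarray(pred, dtype=float)
    true = np.asarray(true, dtype=float)
    return float(np.median(np.abs(pred - true) / true))


# Illustrative values only (not taken from the paper).
logg_true = np.array([3.00, 3.10, 2.90])
logg_pred = np.array([3.02, 3.08, 2.90])
numax_true = np.array([100.0, 100.0, 100.0])   # in microhertz
numax_pred = np.array([110.0, 95.0, 100.0])

logg_rmse = rmse(logg_pred, logg_true)
numax_rel_err = relative_median_abs_error(numax_pred, numax_true)
```

RMSE is reported in the same units as the label (dex for $\log g$), while the relative error is dimensionless, which is why the two quantities are quoted separately.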
