The Scaling Law in Stellar Light Curves
Jia-Shu Pan, Yuan-Sen Ting, Yang Huang, Jie Yu, Ji-Feng Liu
TL;DR
The paper investigates whether scaling laws from other domains apply to astronomical time series by training GPT-2–style autoregressive transformers in a self-supervised fashion on Kepler stellar light curves. It demonstrates that pretraining and downstream performance improve with model size up to $1.5 \times 10^9$ parameters, without a visible plateau, and that the latent representations enable $\log g$ inference with 3–10× greater sample efficiency than a supervised state-of-the-art method. The approach uses a simple GPT-2 framework with MLP-based tokenization and a Huber loss for next-token regression, and it scales well even with a modest pretraining set of 0.7B tokens. These findings suggest that large-scale autoregressive models can serve as robust foundational representations for astronomical time series, offering a scalable path to analyzing data from upcoming surveys such as the Rubin Observatory's LSST and SiTian.
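To make the next-token regression objective concrete, the following is a minimal sketch rather than the authors' implementation. It assumes that a "token" is a fixed-length, non-overlapping patch of flux values, and it replaces the transformer with a hypothetical placeholder predictor (`persistence`) that simply repeats the last patch. The MLP embedding step is omitted, and the names `to_patches`, `next_token_huber`, and `delta` are illustrative only.

```python
def huber(residual, delta=1.0):
    """Huber loss: quadratic for |r| <= delta, linear beyond it."""
    a = abs(residual)
    if a <= delta:
        return 0.5 * a * a
    return delta * (a - 0.5 * delta)

def to_patches(flux, patch_len):
    """Split a flux series into non-overlapping patches (assumed tokenization)."""
    n = len(flux) // patch_len
    return [flux[i * patch_len:(i + 1) * patch_len] for i in range(n)]

def next_token_huber(patches, predict, delta=1.0):
    """Mean Huber loss when patch t+1 is predicted from patches 0..t."""
    total, count = 0.0, 0
    for t in range(len(patches) - 1):
        pred = predict(patches[:t + 1])  # the predictor sees only past patches
        for p, y in zip(pred, patches[t + 1]):
            total += huber(p - y, delta)
            count += 1
    return total / count

def persistence(context):
    """Hypothetical stand-in for the transformer: repeat the last patch."""
    return context[-1]

# Toy flux values chosen for illustration, not taken from Kepler data.
flux = [1.0, 1.0, 1.5, 1.5, 4.5, 4.5]
patches = to_patches(flux, 2)
loss = next_token_huber(patches, persistence)
print(loss)
```

In this toy series the jump from 1.5 to 4.5 falls in the linear regime of the Huber loss, so it contributes 2.5 per point instead of the 4.5 a squared-error loss would give. That damping of large residuals is the usual reason to prefer Huber loss for noisy regression targets.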
Abstract
Analyzing time series of fluxes from stars, known as stellar light curves, can reveal valuable information about stellar properties. However, most current methods rely on extracting summary statistics, and studies using deep learning have been limited to supervised approaches. In this research, we investigate the scaling law properties that emerge when learning from astronomical time series data using self-supervised techniques. By employing the GPT-2 architecture, we show the learned representation improves as the number of parameters increases from $10^4$ to $10^9$, with no signs of performance plateauing. We demonstrate that a self-supervised Transformer model achieves 3-10 times the sample efficiency compared to the state-of-the-art supervised learning model when inferring the surface gravity of stars as a downstream task. Our research lays the groundwork for analyzing stellar light curves by examining them through large-scale auto-regressive generative models.
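A scaling law of this kind is commonly summarized as a power law, $L(N) = a N^{-\alpha}$, where $N$ is the number of parameters and $L$ is the loss. The abstract states only that performance improves with parameter count without plateauing, so the power-law form here is an assumption. The sketch below shows how such an exponent could be estimated by a straight-line fit in log-log space. The data are synthetic and noise-free, generated from a known power law, and are not results from the paper; `fit_power_law` is an illustrative name.

```python
import math

def fit_power_law(n_params, losses):
    """Least-squares fit of log L = log a - alpha * log N; returns (a, alpha)."""
    xs = [math.log(n) for n in n_params]
    ys = [math.log(l) for l in losses]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return math.exp(intercept), -slope

# Synthetic data from L = 2 * N^-0.1 (NOT values reported in the paper).
sizes = [1e4, 1e5, 1e6, 1e7, 1e8, 1e9]
synthetic_losses = [2.0 * n ** -0.1 for n in sizes]

a, alpha = fit_power_law(sizes, synthetic_losses)
print(a, alpha)  # recovers the generating values up to floating-point error
```

With real measurements the points would scatter around the line. A plateau would show up as systematic curvature away from the fitted line at large $N$.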
