TIDMAD: Time Series Dataset for Discovering Dark Matter with AI Denoising
J. T. Fry, Xinyi Hope Fu, Zhenghao Fu, Kaliroe M. W. Pappas, Lindley Winslow, Aobo Li
TL;DR
TIDMAD releases an ultra-long ABRACADABRA time-series dataset with ground-truth signal injections to train and benchmark AI-based denoising for axion dark matter searches. It pairs training and validation data with a science set and provides a physics-focused benchmark (the Brazil-band TS limit) that translates denoising performance into improved dark matter limits. The injected ground truth follows the signal model $J_{\mathrm{eff}} = g_{a\gamma\gamma} \sqrt{2 \rho_{\mathrm{DM}}} \mathbf{B}_0 \cos(m_a t)$, which oscillates at the frequency $f_a = m_a/2\pi$ set by the axion mass $m_a$. Eight denoising methods are evaluated, with deep-learning architectures substantially outperforming traditional filters; the best-performing model, FC Net, enhances sensitivity by 1–2 orders of magnitude across $m_a$ relative to the raw data. Although hardware constraints prevent this dataset from surpassing the ABRA-10cm Run 3 limits, the approach demonstrates a clear path to broader discovery potential and provides a foundation for generalizing time-series denoising to other wave-like dark matter searches and physics experiments. The release thus supports collaboration between ML researchers and the particle-astrophysics community by making it possible to turn denoising performance directly into publishable physics results.
Abstract
Dark matter makes up approximately 85% of the total matter in our universe, yet it has never been directly observed in any laboratory on Earth. The origin of dark matter is one of the most important questions in contemporary physics, and a convincing detection of dark matter would be a Nobel-Prize-level breakthrough in fundamental science. The ABRACADABRA experiment was specifically designed to search for dark matter. Although it has not yet made a discovery, ABRACADABRA has produced several dark matter search results widely endorsed by the physics community. The experiment generates ultra-long time-series data at a rate of 10 million samples per second, in which a dark matter signal would manifest itself as a sinusoidal oscillation mode. In this paper, we present TIDMAD, a comprehensive data release from the ABRACADABRA experiment with three key components: an ultra-long time-series dataset divided into training, validation, and science subsets; a carefully designed denoising score for direct model benchmarking; and a complete analysis framework that produces a community-standard dark matter search result suitable for publication as a physics paper. This data release enables core AI algorithms to extract the dark matter signal and produce real physics results, thereby advancing fundamental science. The data download and associated analysis scripts are available at https://github.com/jessicafry/TIDMAD
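To make the phrase "sinusoidal oscillation mode within the ultra-long time series" concrete, the following is a minimal sketch, not the TIDMAD analysis framework. It injects a weak cosine into synthetic noise sampled at 10 million samples per second and recovers its frequency from the power spectrum. The function names, the injected amplitude, and the white Gaussian noise model are illustrative assumptions; real ABRACADABRA noise is not expected to be white. The mass-to-frequency conversion is $f_a = m_a/2\pi$ written in SI units, $f_a = m_a c^2/h$.

```python
import numpy as np

PLANCK_EV_S = 4.135667696e-15  # Planck constant h in eV*s


def axion_frequency_hz(mass_ev):
    """Convert an axion mass (in eV) to its oscillation frequency, f = m c^2 / h."""
    return mass_ev / PLANCK_EV_S


def inject_and_find_peak(f_signal_hz, fs_hz=1.0e7, n_samples=1_000_000,
                         amplitude=0.05, noise_sigma=1.0, seed=0):
    """Hypothetical helper: inject a cosine into white noise and locate it.

    Returns (peak_frequency_hz, frequency_resolution_hz).
    """
    rng = np.random.default_rng(seed)
    t = np.arange(n_samples) / fs_hz

    # Assumption: white Gaussian noise stands in for the detector noise.
    noise = noise_sigma * rng.standard_normal(n_samples)
    signal = amplitude * np.cos(2.0 * np.pi * f_signal_hz * t)
    series = noise + signal

    # One-sided power spectrum; a coherent sinusoid piles up in a single bin,
    # while the noise power is spread over all bins.
    power = np.abs(np.fft.rfft(series)) ** 2
    freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs_hz)

    # Skip the DC bin when searching for the peak.
    peak_index = np.argmax(power[1:]) + 1
    return freqs[peak_index], freqs[1] - freqs[0]


if __name__ == "__main__":
    print("1 neV axion frequency [Hz]:", axion_frequency_hz(1e-9))
    peak_hz, df_hz = inject_and_find_peak(5.0e5)
    print("recovered peak [Hz]:", peak_hz, "resolution [Hz]:", df_hz)
```

In this sketch the injected amplitude is twenty times smaller than the noise standard deviation, so the oscillation is invisible sample by sample and only emerges after integrating over the full series. This is the sense in which longer, cleaner time series improve sensitivity to a fixed-frequency signal.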
