A Catalogue of Mid-infrared Variable Sources from unTimely

Zihan Kang, Jingyi Zhang, Yanxia Zhang, Changhua Li, Xiao Kong, Minzhi Kong, Jinghang Shi, Shirui Wei, Xue-Bing Wu

TL;DR

This work addresses the lack of a comprehensive mid-infrared variability catalog by constructing the first large-scale, all-sky sample of mid-IR variable sources from the unTimely coadded photometry (W1 and W2). A Bayesian Gaussian mixture model with a Dirichlet process separates variable from non-variable sources without supervision; outlier-detection methods and a cross-band correlation filter then mitigate artifacts. The resulting catalog comprises 8.26 million W1 variables and 7.15 million W2 variables, of which 4.29 million are variable in both bands. It enables systematic statistical studies and the discovery of rare objects such as eruptive YSOs and highly variable AGNs. The dataset provides a mid-infrared counterpart to optical surveys, supports cross-matched multi-wavelength analyses, and can guide future investigations of stellar evolution, accretion processes, and dust-enshrouded environments on both Galactic and extragalactic scales.
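To make the idea of a cross-band correlation filter concrete, here is a minimal sketch. It assumes the filter compares epoch-matched W1 and W2 magnitudes with a Pearson coefficient; the function name `cross_band_correlation` and the toy light curves are hypothetical and not taken from the paper, whose exact statistic and threshold are not given in this summary.

```python
import math

def cross_band_correlation(w1, w2):
    """Pearson correlation between epoch-matched W1 and W2 magnitudes.

    Illustrative helper (hypothetical name), not the authors' code.
    """
    if len(w1) != len(w2) or len(w1) < 2:
        raise ValueError("need two equal-length series with at least 2 epochs")
    n = len(w1)
    m1 = sum(w1) / n
    m2 = sum(w2) / n
    cov = sum((a - m1) * (b - m2) for a, b in zip(w1, w2))
    var1 = sum((a - m1) ** 2 for a in w1)
    var2 = sum((b - m2) ** 2 for b in w2)
    if var1 == 0 or var2 == 0:
        return 0.0  # a perfectly flat light curve carries no correlation signal
    return cov / math.sqrt(var1 * var2)

# Toy light curves (magnitudes per epoch), invented for illustration only.
correlated_w1 = [14.50, 14.42, 14.31, 14.40, 14.55, 14.62]
correlated_w2 = [14.10, 14.03, 13.90, 14.01, 14.16, 14.20]
one_band_glitch_w1 = [14.50, 14.50, 13.90, 14.50, 14.50, 14.50]
one_band_glitch_w2 = [14.10, 14.11, 14.10, 14.09, 14.10, 14.10]

r_real = cross_band_correlation(correlated_w1, correlated_w2)
r_glitch = cross_band_correlation(one_band_glitch_w1, one_band_glitch_w2)
```

In this toy case the source that varies in both bands gives a coefficient close to 1, while the single-band excursion gives a coefficient near 0, which is the kind of contrast such a filter would exploit.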

Abstract

The WISE and NEOWISE missions have provided the only mid-infrared all-sky time-domain data, opening a unique observational window for variability studies. Yet a comprehensive and systematic catalog of mid-infrared variable sources has remained unavailable. In this work, we construct the first large-scale mid-infrared variability catalog based on the unTimely coadded photometry, covering tens of millions of sources. By employing a Bayesian Gaussian mixture model with a Dirichlet process, we identify 8,256,042 variable sources in the W1 band and 7,147,661 in the W2 band, significantly expanding the landscape of known mid-infrared variables. In addition to robust variability metrics, our analysis highlights rare and extreme outliers through dedicated outlier-detection algorithms, enabling the discovery of unusual classes of objects such as eruptive young stellar objects, highly variable active galactic nuclei, and other rare transients. This unprecedented dataset provides a new foundation for time-domain astronomy in the mid-infrared, offering complementary insights to optical and near-infrared surveys, and opening the door to systematic investigations of stellar evolution, accretion processes, and dust-enshrouded astrophysical environments on Galactic and extragalactic scales.
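As a rough illustration of the clustering step, the sketch below fits a truncated Dirichlet-process Gaussian mixture with scikit-learn's `BayesianGaussianMixture` on synthetic data. The two features, the population sizes, and all hyperparameters are assumptions chosen for the example; the paper's actual feature set and settings are not listed in this summary. Treating the most populated component as the non-variable population follows the Figure 3 caption.

```python
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)

# Synthetic two-feature variability metrics (hypothetical stand-ins):
# a dense "non-variable" clump plus a sparse, broader "variable" population.
quiet = rng.normal(loc=[0.0, 0.0], scale=0.3, size=(2000, 2))
variable = rng.normal(loc=[4.0, 4.0], scale=1.0, size=(100, 2))
features = np.vstack([quiet, variable])

# Truncated Dirichlet-process mixture: n_components is only an upper bound,
# and components the data do not need end up with weights close to zero.
bgmm = BayesianGaussianMixture(
    n_components=10,
    weight_concentration_prior_type="dirichlet_process",
    covariance_type="full",
    max_iter=500,
    random_state=0,
)
labels = bgmm.fit_predict(features)

# Assumption: the most populated component is the non-variable population;
# everything assigned elsewhere is kept as a variable candidate.
largest = np.bincount(labels, minlength=bgmm.n_components).argmax()
candidate_mask = labels != largest
```

The appeal of the Dirichlet-process prior here is that the number of clusters does not have to be fixed in advance. Note that if the fit splits the quiet clump across several components, this simple rule would also flag part of it, so a real pipeline would need a more careful selection.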

Paper Structure

This paper contains 9 sections, 6 equations, 7 figures.

Figures (7)

  • Figure 1: The ratio of the standard deviation (SD) to the mean error ($\sigma$) as a function of mean magnitude, for the NEOWISE epoch medians (left panel) and the unTimely data (right panel). The analysis is based on 5000 randomly sampled SDSS Stripe 82 standard stars. The unTimely data clearly underestimate the photometric uncertainties, particularly for bright sources.
  • Figure 2: Distribution of cross-correlation coefficients derived from a preliminary simulation. The red segment represents the actual data. The blue segment illustrates the distribution when incorrect deblending occurs in only one band. The green segment depicts the distribution when incorrect deblending occurs in both bands at different times. The yellow segment represents the outcome of a preliminary simulation of the variable sources. Notably, the true number of non-variable sources is approximately 100 times greater than that of the variable sources.
  • Figure 3: The left panel illustrates the distribution of the clusters found by the algorithm when applied to the 14.5-14.6 subset. The largest cluster corresponds to non-variable sources. The right panel depicts the Mahalanobis distance of the sources from the center of the largest cluster, highlighting a threshold at the 99.7th percentile.
  • Figure 4: The criteria for filtering out the two types of artifacts. The left panel displays the ratio of the standard deviation before and after the removal of the largest one or two data points, incorporating both the actual data and a rough simulation. The right panel illustrates the distribution of the Lomb-Scargle period around one year.
  • Figure 5: Two primary types of artificial variable light curves. The left panel illustrates the artifact resulting from incorrect deblending, while the right panel depicts the artifact caused by the WISE satellite instrument.
  • ...and 2 more figures
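
The statistic in the left panel of Figure 4 can be sketched as follows. This is a minimal illustration that assumes "largest" means the points farthest from the light-curve median and that the ratio is taken as SD before removal over SD after removal; the function name `sd_ratio_after_clipping` and the toy light curves are hypothetical.

```python
import statistics

def sd_ratio_after_clipping(mags, n_remove=1):
    """SD of all epochs divided by SD after dropping the n_remove most deviant ones.

    Illustrative helper (hypothetical name), not the authors' code.
    """
    if len(mags) - n_remove < 2:
        raise ValueError("too few epochs left after removal")
    med = statistics.median(mags)
    # Sort by distance from the median and keep all but the farthest points.
    kept = sorted(mags, key=lambda m: abs(m - med))[: len(mags) - n_remove]
    sd_all = statistics.stdev(mags)
    sd_kept = statistics.stdev(kept)
    if sd_kept == 0:
        return float("inf")
    return sd_all / sd_kept

# Toy light curves (magnitudes per epoch), invented for illustration only.
spike = [14.50, 14.51, 14.49, 14.50, 13.80, 14.52, 14.48, 14.50]
smooth = [14.50, 14.40, 14.30, 14.20, 14.30, 14.40, 14.50, 14.60]

ratio_spike = sd_ratio_after_clipping(spike, n_remove=1)
ratio_smooth = sd_ratio_after_clipping(smooth, n_remove=1)
```

A light curve whose scatter comes from a single discrepant epoch yields a large ratio, whereas a smoothly varying source stays close to 1, so a cut on this ratio can separate one-point artifacts from sustained variability.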