Astrometric Binary Classification Via Artificial Neural Networks

Joe Smith

TL;DR

Distinguishing true astrometric binaries from chance alignments at the data volume of Gaia DR3 requires automated, scalable methods. The authors train a dense artificial neural network on 1.5 million labeled binaries using six features derived from component proper motions, parallaxes, and separations, achieving ROC-AUC ≈ 0.999 and accuracy ≈ 0.993 on the test set. Misclassifications cluster near the decision boundary, and training converges rapidly with minimal overfitting. The results demonstrate a fast, automatic alternative to traditional binary catalogs and point to future improvements from Gaia DR4 radial-velocity data and from active learning to improve classification near the decision boundary.

Abstract

With nearly two billion stars observed and their astrometric parameters measured by the Gaia mission, the number of astrometric binary candidates has risen significantly. Given this volume of astrometric data, the computational methods currently used to inspect these candidates are expensive and cannot be executed in a reasonable time frame. In light of this, a machine learning (ML) technique is proposed that uses an artificial neural network (ANN) to automatically classify whether a pair of stars forms an astrometric binary. Using data from Gaia DR3, the ANN was trained and tested on 1.5 million highly probable true and visual binaries, with the proper motions, parallaxes, and angular and physical separations as features. The ANN achieves high classification scores, with an accuracy of 99.3%, a precision of 0.988, a recall of 0.991, and an AUC of 0.999, indicating that the technique is highly effective for classifying astrometric binaries. Thus, the proposed ANN is a promising alternative to existing methods for the classification of astrometric binaries.
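
The accuracy, precision, and recall quoted in the abstract all follow from the four cells of a binary confusion matrix. The sketch below shows that arithmetic, assuming that "true binary" is treated as the positive class. The counts are hypothetical and chosen only to make the arithmetic easy to follow; they are not the paper's results, and `ConfusionCounts` and `classification_metrics` are illustrative names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cells of a binary confusion matrix (positive class assumed: true binary)."""
    tp: int  # true binaries classified as true binaries
    fp: int  # visual binaries classified as true binaries
    tn: int  # visual binaries classified as visual binaries
    fn: int  # true binaries classified as visual binaries


def classification_metrics(c: ConfusionCounts) -> dict:
    """Return accuracy, precision, and recall for the given counts."""
    total = c.tp + c.fp + c.tn + c.fn
    if total == 0 or (c.tp + c.fp) == 0 or (c.tp + c.fn) == 0:
        raise ValueError("metrics are undefined for these counts")
    return {
        "accuracy": (c.tp + c.tn) / total,    # fraction of all pairs labeled correctly
        "precision": c.tp / (c.tp + c.fp),    # purity of the predicted true binaries
        "recall": c.tp / (c.tp + c.fn),       # completeness of the true binaries
    }


# Hypothetical counts for illustration only (not taken from the paper).
counts = ConfusionCounts(tp=90, fp=10, tn=880, fn=20)
metrics = classification_metrics(counts)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Precision and recall are reported alongside accuracy because accuracy alone can look high when one class dominates the sample.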

Paper Structure

This paper contains 16 sections, 2 equations, 7 figures, and 2 tables.

Figures (7)

  • Figure 1: Histogram of true and visual binaries in the sorted catalogue as a function of their physical separation. The values fall in the range $0.70 < \log(s\,\mathrm{au}^{-1}) < 5.31$, obeying the initial condition $s < 206265$ au.
  • Figure 2: Physical separations plotted against their corresponding angular separations. Overplotted for true (blue) and visual (orange) binaries are 2D density histograms.
  • Figure 3: The architecture of the proposed ANN, where the inputs are the six features associated with the input binary candidate (Section \ref{subsec:feat}). See Table \ref{tab2} for the model architecture of the ANN in detail.
  • Figure 4: The first row presents the performance of the ANN considering the accuracy (left) and loss (right) evolutions during the training and validation phases. The second row presents the computed ROC curve (left) and confusion matrix (right) during the test phase.
  • Figure 5: Histogram of misclassified and correctly classified binaries from the test set as a function of their physical separation. The physical separations of misclassified binaries fall in the range $2.95 < \log(s\,\mathrm{au}^{-1}) < 5.30$, with values within $\pm 3\sigma$ of the mean falling in $4.10 < \log(s\,\mathrm{au}^{-1}) < 5.18$.
  • ...and 2 more figures
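
The captions of Figures 1 and 2 relate physical separation to angular separation, and the upper limit of 206265 au in Figure 1 is approximately one parsec. The sketch below shows one common way to make that conversion, assuming the small-angle relation and a distance obtained by simply inverting the parallax. Whether the paper computes separations exactly this way is an assumption, and the function names are illustrative.

```python
import math

# Upper limit quoted in the Figure 1 caption; approximately one parsec in au.
MAX_SEPARATION_AU = 206265.0


def physical_separation_au(theta_arcsec: float, parallax_mas: float) -> float:
    """Projected separation in au from angular separation and parallax.

    Assumes the small-angle relation s [au] = theta [arcsec] * d [pc],
    with d [pc] = 1000 / parallax [mas] (naive parallax inversion).
    """
    if parallax_mas <= 0:
        raise ValueError("parallax must be positive for a simple inversion")
    distance_pc = 1000.0 / parallax_mas
    return theta_arcsec * distance_pc


def log_separation(theta_arcsec: float, parallax_mas: float) -> float:
    """Base-10 logarithm of the projected separation in au."""
    return math.log10(physical_separation_au(theta_arcsec, parallax_mas))


# Hypothetical pair: 2 arcsec apart at a parallax of 10 mas (100 pc).
s = physical_separation_au(2.0, 10.0)
print(f"s = {s:.1f} au, log10(s/au) = {log_separation(2.0, 10.0):.2f}")
print(f"within the limit: {s < MAX_SEPARATION_AU}")
print(f"log10 of the limit: {math.log10(MAX_SEPARATION_AU):.2f}")
```

The logarithm of the limit, about 5.31, matches the upper bound of the range quoted in the Figure 1 caption.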