Predicting Infall Time of Milky-Way Satellites via Machine Learning

Seungyeon Kim, Myoungwon Jeon, Seongjun Hyung

TL;DR

The study tackles the challenge of predicting satellite infall times into Milky Way–like hosts by exploiting a physical quenching proxy, the quenching time $\tau_{90}$, together with $M_{\star}$ and $\mathrm{[Fe/H]}$, and training a LightGBM model on a large suite of A-SLOTH simulations. It shows that excluding satellites with prior group preprocessing improves MW infall predictions to a mean squared error (MSE) of about $5.04$, and that predicting the first infall for group-preprocessed satellites yields a markedly lower MSE of about $1.66$, underscoring the importance of the first infall in quenching. The work also compares the predicted infall times of MW satellites with observationally inferred ones and extends predictions to M31 satellites, revealing trends consistent with $\tau_{90}$ and highlighting the role of the ionizing background for very low-mass systems. Overall, the work provides a fast, interpretable ML framework for inferring satellite infall histories, with potential application to observed Local Group galaxies and future datasets.

Abstract

The properties of dwarf galaxies provide essential insight into galaxy formation and evolution in a hierarchical universe. Among various physical quantities, identifying their infall times to host galaxies is crucial, as these times encode key information such as star formation histories. However, estimating infall times remains challenging due to the complex interplay between different physical processes and the lack of consensus among existing methods. We propose a fast and interpretable method to predict the infall time of dwarf satellites using LightGBM, a gradient-boosting decision tree algorithm. Our model is trained on satellites from 30 Milky Way (MW)-like host galaxies generated by A-SLOTH, a semi-analytic model calibrated using observational constraints, including those from the MW and its satellites. To balance predictive ability and observational applicability, we adopt $\tau_{90}$, [Fe/H], and $M_{\star}$ as input features. Since satellites with prior group membership hinder accurate MW infall predictions, we exclude them from the training data. As a result, the model achieves the best average mean squared error (MSE) of 5.04 in the A-SLOTH data set. Our model also shows good agreement with existing observational studies of MW satellites, although some discrepancies remain due to a few outliers such as CVn II and UMa I. In addition, for satellites experiencing prior infall events before MW-like host infall, the model predicts the timing of the first infall with a significantly lower MSE of 1.66, indicating the importance of the earliest infall in the quenching process of satellite galaxies.
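
To make the quoted error metric concrete, the following minimal sketch computes an MSE for a handful of made-up infall times and converts the two MSE values from the abstract into root-mean-square errors. It assumes that infall times are expressed in Gyr, so that the MSE is in Gyr$^2$; the abstract does not state the unit explicitly. The helper `mean_squared_error` and the sample values are illustrative and not taken from the paper.

```python
import math


def mean_squared_error(targets, predictions):
    """Average squared difference between target and predicted infall times."""
    if len(targets) != len(predictions) or not targets:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)


# Hypothetical lookback infall times in Gyr (illustrative, not from the paper).
target_gyr = [9.0, 6.5, 3.0, 11.0]
predicted_gyr = [8.0, 7.5, 5.0, 10.0]

mse = mean_squared_error(target_gyr, predicted_gyr)  # (1 + 1 + 4 + 1) / 4
rmse = math.sqrt(mse)

# RMSE implied by the MSE values quoted in the abstract (assuming Gyr^2 units).
rmse_mw_infall = math.sqrt(5.04)     # MW infall, group-preprocessed satellites excluded
rmse_first_infall = math.sqrt(1.66)  # first infall of group-preprocessed satellites

print(f"toy MSE = {mse:.2f}, toy RMSE = {rmse:.2f}")
print(f"RMSE (MW infall) = {rmse_mw_infall:.2f}, RMSE (first infall) = {rmse_first_infall:.2f}")
```

Under that unit assumption, the quoted MSEs correspond to typical errors of roughly 2.2 Gyr for the MW infall time and roughly 1.3 Gyr for the first infall time.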

Paper Structure

This paper contains 21 sections, 9 figures, 7 tables.

Figures (9)

  • Figure 1: Example orbital histories of satellite galaxies in our A-SLOTH data: a satellite that falls directly into the MW-like galaxy (left) and a satellite that first falls into a prior group before infall into the MW-like galaxy (right). The solid lines in both panels indicate the distance from the MW-like host galaxy to the satellites, while the gray dashed line shows the virial radius of the MW-like host galaxy. The blue dash-dot line in the right panel represents the period when the satellite is within the prior group, and the red lines depict periods when the satellite is inside the virial radius of the MW-like host galaxy. The infall time of the satellites, defined as the first passage of the satellite's center across the virial radius of the MW-like host galaxy, is marked by a yellow star. Alt text: Example of a satellite’s orbital history, illustrating a direct infall event onto the MW compared to a prior group infall event before reaching the MW.
  • Figure 2: Pearson correlation coefficients for each feature indicate that the first infall time, $t_\mathrm{first}$, has the strongest correlation with quenching time. The infall time to the MW-like host shows a moderate correlation, while metallicity ($\mathrm{[Fe/H]}$) and stellar mass ($M_{\star}$) of the satellites exhibit low correlation coefficients. This suggests that using $\mathrm{[Fe/H]}$ and $M_{\star}$ alone would yield poor predictive performance. Alt text: Correlation analysis of each feature with the first infall time onto any halo and the infall time onto an MW-like galaxy.
  • Figure 3: Flowchart for constructing the LightGBM model. Utilizing 30 DM halo merger trees, we generate 12,102 satellite galaxies with A-SLOTH. From each satellite, we extract $\tau_{90}$ [Gyr], $M_{\star}\ [M_\odot]$, and $\mathrm{[Fe/H]}$ as features to predict the infall time. We randomly sample 80% of the data for model training and optimize the model to find the best parameters, and the remaining 20% is used to test the model to evaluate the performance. Alt text: Flowchart of the LightGBM model, describing the process from data extraction for dwarf galaxies to training and testing.
  • Figure 4: The quenching timescale of the first infall time (blue circle) and the MW infall time (red cross) of satellites which had belonged to prior groups. Green circles and purple diamonds represent the median timescale and 1-$\sigma$ uncertainty for the first infall and MW infall times, respectively. Color opacity indicates different mass groups, with lower opacity corresponding to lower-mass groups. Proximity to the gray dashed line indicates a smaller interval between the infall event and quenching time. Alt text: Quenching timescales of satellites related to infall events, derived from the A-SLOTH model.
  • Figure 5: Comparison between the target infall times of satellite galaxies from A-SLOTH (considered as true values) and those predicted by our LightGBM model, shown in units of lookback Gyr as a heatmap. The left, middle, and right panels correspond to the low-, intermediate-, and heavy-mass groups, with the color intensity indicating the number of satellites at each point. Markers close to the dashed grey diagonal indicate a close match between predicted and target infall times. We find that the intermediate-mass group shows the best agreement, while the heavy-mass group exhibits the largest scatter. In the low-mass group, the scatter increases for satellites with infall times more recent than $\sim$ 4 Gyr ago. Alt text: Comparison of target and predicted infall times of A-SLOTH satellites across different mass groups.
  • ...and 4 more figures
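
The correlation screening described in the Figure 2 caption can be sketched as follows. This is a minimal illustration rather than the authors' code: the helper `pearson_r` and the toy values for $\tau_{90}$, $t_\mathrm{first}$, and $\mathrm{[Fe/H]}$ are hypothetical, chosen only so that one quantity tracks the quenching time closely and the other does not.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy catalogue (illustrative values, not A-SLOTH output).
tau90 = [10.0, 8.0, 6.0, 4.0, 2.0]     # quenching time, lookback Gyr
t_first = [11.0, 9.5, 7.0, 5.5, 3.0]   # first infall time, lookback Gyr
feh = [-2.0, -1.2, -2.4, -1.5, -1.9]   # metallicity [Fe/H]

r_first = pearson_r(tau90, t_first)  # strongly positive for these toy values
r_feh = pearson_r(tau90, feh)        # close to zero for these toy values

print(f"r(tau90, t_first) = {r_first:.3f}")
print(f"r(tau90, [Fe/H])  = {r_feh:.3f}")
```

In this toy setup the first infall time is far more strongly correlated with the quenching time than the metallicity is, which mirrors the qualitative ordering reported in the caption.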