Identification of Anomalous E+A Galaxies in GAMA Using an Isolation Forest

Kieran Broadbelt, Kevin Pimbblet, Daniel J. Farrow

TL;DR

This work addresses the challenge of discovering rare, informative outliers in large galaxy surveys by applying an unsupervised Isolation Forest to GAMA DR4, with a focus on E+A galaxies and high-S/N systems. It leverages spectroscopic (51 EW) and photometric (326 features) data, augmented by morphometric information, and uses three data partitions (spectroscopic, photometric, combined) to derive anomaly scores. The study identifies 101 bona fide anomalous galaxies (plus 2 shredded cases) encompassing extreme emission line galaxies, red spirals, AGN hosts, and star-forming E+A variants, highlighting cases not previously catalogued and flagging data-reduction issues. These findings have implications for E+A selection and galaxy evolution, suggesting physical mechanisms such as small-scale interactions and Jeans-mass limits, and demonstrate the utility of unsupervised anomaly detection for quality control and discovery in large surveys.

Abstract

We implement an outlier detection model, an Isolation Forest (iForest), to uncover anomalous objects in the Galaxy and Mass Assembly Fourth Data Release (GAMA DR4). The iForest algorithm is an unsupervised Machine Learning (ML) technique. We use the spectroscopic and photometric data from GAMA DR4, which compiles information for over 300,000 objects. We select two samples of galaxies to isolate: high signal-to-noise galaxies, to analyse the iForest's robustness, and E+A galaxies, to study the extremes of their population. This results in six subsamples of spectroscopic, photometric and combined data isolations, in which we find 101 anomalous objects, half of which have not been identified as outliers in other works. We also find a number of fringing errors and false emission lines, which displays the iForest's potential for detecting such errors. We find anomalous E+A galaxies that, although selected in the normal manner using low [OII] emission and strong Hδ absorption, are still star-forming, with strong Hα emission. We propose two explanations for why these E+A galaxies are still star-forming, but also question whether they can truly be classified as E+A galaxies. First, we suggest that small-scale interactions on the galaxies cause small starbursts, in which the radiative pressure from forming high-mass stars expels the accreting material more quickly than it can be accreted. Second, we suggest that the Jeans limit in our anomalous E+A galaxies is so low that it is simply not possible to form O and B class stars, but not low enough to fully prevent star formation.
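The isolation step described in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn's `IsolationForest` as a stand-in, since the text here does not name the implementation used, and it runs on synthetic data rather than GAMA DR4. The feature count, the injected outliers and the hyperparameters are all illustrative assumptions, not the authors' configuration. The sign convention (scores below 0 flag anomalies) is that of scikit-learn's `decision_function`.

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(42)

# Synthetic stand-in for a feature table (NOT GAMA data): rows are objects,
# columns are features such as equivalent widths or photometric measurements.
n_normal, n_outliers, n_features = 1000, 5, 5
X_normal = rng.normal(0.0, 1.0, size=(n_normal, n_features))

# Hypothetical extreme objects, placed 8 to 12 sigma from the bulk in every feature.
signs = rng.choice([-1.0, 1.0], size=(n_outliers, n_features))
X_outliers = signs * rng.uniform(8.0, 12.0, size=(n_outliers, n_features))
X = np.vstack([X_normal, X_outliers])

# Unsupervised fit: no labels are supplied. The hyperparameters are
# scikit-learn defaults, assumed here for illustration only.
forest = IsolationForest(n_estimators=100, contamination="auto", random_state=0)
forest.fit(X)

# decision_function: negative scores mark anomalies, positive scores mark
# nominal objects. predict returns -1 for anomalies and +1 otherwise.
scores = forest.decision_function(X)
labels = forest.predict(X)

# Rank objects from most to least anomalous for follow-up inspection.
ranked = np.argsort(scores)
print("most anomalous rows:", ranked[:n_outliers])
print("number flagged:", int((labels == -1).sum()))
```

Sorting by score rather than relying only on the binary label gives a ranked candidate list, which suits a workflow where the most isolated objects are then inspected individually.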

Paper Structure

This paper contains 20 sections, 13 equations, 11 figures, 2 tables.

Figures (11)

  • Figure 1: Anomaly scores of the two samples. Left: high-S/N sample, consisting of $\sim10^5$ objects. The red dashed line marks the anomaly threshold: scores below 0 indicate anomalies, scores above 0 indicate nominal objects. Right: E+A sample, consisting of 287 objects, with the same threshold line as in the S/N panel.
  • Figure 2: A corner plot of the five most common morphometric measures: concentration, asymmetry, smoothness, Gini and $M_{20}$. Orange pluses are E+A spectroscopic anomalies, light blue crosses are E+A photometric anomalies, green diamonds are S/N spectroscopic anomalies and navy blue circles are S/N photometric anomalies. The plot shows that a significant portion of the anomalous objects have 'normal' morphologies, in that they lie within the bulk of the other objects in GAMA. There are some extreme morphological outliers, for example in asymmetry and smoothness.
  • Figure 3: Example shredded galaxy. Left: the falsely identified galaxy, which is actually a bright substructure of the larger galaxy shown on the right.
  • Figure 4: Two types of 'bad spectra' found by the iForest. Top: false emission line spectrum, displaying peaks at wavelengths that do not correspond to valid emission lines. This is caused by poor flat fielding, an issue with early AAOmega that has since been fixed (Croom, priv. comm.). Bottom: data reduction error, which presents as a sinusoidal pattern in the spectrum.
  • Figure 5: Two types of anomalous spectra found by the iForest. Top: strong ${\rm{H\alpha}}$ E+A galaxy extracted from GAMA DR4. This galaxy has an $\rm{EW}_{H\alpha} \approx 21$Å, $\rm{EW}_{[OII]} \approx 2$Å and $\rm{EW}_{H\delta} \approx -3$Å. Bottom: EELG extracted from GAMA DR4. This galaxy has an $\rm{EW}_{[OIII]} \approx 546$Å.
  • ...and 6 more figures