An Interpretable AI Framework to Disentangle Self-Interacting and Cold Dark Matter in Galaxy Clusters: The CKAN Approach

Zhenyang Huang, Haihao Shi, Zhiyong Liu, Na Wang

TL;DR

The paper tackles the interpretability gap in distinguishing SIDM from CDM in galaxy clusters by introducing CKAN, a hybrid of convolutional and Kolmogorov-Arnold networks that uses symbolic regression for interpretability. Using the BAHAMAS-SIDM simulations, CKAN achieves approximately 80% validation accuracy and identifies a SIDM cross-section threshold of $(\sigma/m)_{\mathrm{th}} \in [0.1,0.3]\,\mathrm{cm^2/g}$, consistent with theoretical expectations. CKAN is also robust to JWST- and Euclid-like noise, supporting its applicability to upcoming surveys. The work offers a framework for obtaining physically interpretable, low-parameter models that can inform SIDM constraints in cluster observations.

Abstract

Convolutional neural networks have shown their ability to differentiate between self-interacting dark matter (SIDM) and cold dark matter (CDM) on galaxy cluster scales. However, their large parameter counts and "black-box" nature make it difficult to assess whether their decisions adhere to physical principles. To address this issue, we have built a Convolutional Kolmogorov-Arnold Network (CKAN) that reduces parameter count and enhances interpretability, and propose a novel analytical framework to understand the network's decision-making process. With this framework, we leverage our network to qualitatively assess the offset between the dark matter distribution center and the galaxy cluster center, as well as the size of heating regions in different models. These findings are consistent with current theoretical predictions and show the reliability and interpretability of our network. By combining network interpretability with unseen test results, we also estimate that for SIDM in galaxy clusters, the minimum cross-section $(\sigma/m)_{\mathrm{th}}$ required to reliably identify its collisional nature falls between $0.1\,\mathrm{cm}^2/\mathrm{g}$ and $0.3\,\mathrm{cm}^2/\mathrm{g}$. Moreover, CKAN maintains robust performance under simulated JWST and Euclid noise, highlighting its promise for application to forthcoming observational surveys.

Paper Structure

This paper contains 5 sections, 5 equations, 3 figures.

Figures (3)

  • Figure 1: Schematic illustration of our network. To better study the features extracted by each channel, we feed the distribution maps from the three channels (total, stellar, and X-ray) into the convolutional KAN kernel and fully connected layers, which share the same configuration, and then sum the results $y_{i}$ to produce the network's final output $Y$. The network's loss function is set to cross-entropy, with its output being a three-class vector corresponding to the probabilities of each class.
  • Figure 2: Training metrics of CKAN over 80 epochs. The network shows no evidence of overfitting, either in terms of loss or accuracy.
  • Figure 3: The confusion matrix of the network on the validation set; the CDM class includes CDM-low AGN, CDM fiducial AGN, and CDM-hi AGN.
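
The output stage described in the Figure 1 caption (per-channel outputs $y_{i}$ summed into $Y$, trained with a cross-entropy loss over three classes) can be illustrated with a minimal sketch. This is not the authors' implementation: the numeric values are made up, the function names are hypothetical, and it assumes the sum is taken over unnormalized scores before a softmax, which the caption does not state explicitly.

```python
import math


def sum_channel_outputs(channel_outputs):
    """Element-wise sum of the per-channel output vectors y_i, giving Y."""
    n_classes = len(channel_outputs[0])
    return [sum(y[k] for y in channel_outputs) for k in range(n_classes)]


def softmax(scores):
    """Convert raw scores to class probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(scores, true_class):
    """Cross-entropy loss for one sample given its true class index."""
    return -math.log(softmax(scores)[true_class])


# Hypothetical outputs of the three branches (total, stellar, X-ray),
# each a three-class vector; the values are illustrative only.
y_total = [1.2, -0.3, 0.1]
y_stellar = [0.4, 0.2, -0.5]
y_xray = [0.9, -0.1, 0.0]

Y = sum_channel_outputs([y_total, y_stellar, y_xray])
probs = softmax(Y)
loss = cross_entropy(Y, true_class=0)

print("Y =", Y)
print("probabilities =", probs)
print("loss =", loss)
```

Keeping the three branches separate until the final sum means each channel's contribution $y_{i}$ to the decision can be inspected on its own, which matches the caption's stated aim of studying the features extracted by each channel.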