ForamDeepSlice: A High-Accuracy Deep Learning Framework for Foraminifera Species Classification from 2D Micro-CT Slices

Abdelghafour Halimi, Ali Alibrahim, Didier Barradas-Bautista, Ronell Sicat, Abdulkader M. Afifi

TL;DR

This work tackles automated foraminifera species classification from 2D micro-CT slices. The authors assemble a curated dataset of 97 specimens across 27 species, retain the 12 species with sufficient representation, and evaluate seven pretrained CNNs under a specimen-level split to avoid data leakage. A two-phase training regime with transfer learning and strong data augmentation, combined with a PatchEnsemble of ConvNeXt-Large and EfficientNetV2-Small, yields 95.64% test accuracy, 99.6% top-3 accuracy, and 0.998 AUC. An interactive dashboard provides real-time slice classification and 3D slice matching using SSIM, NCC, and the Dice coefficient, connecting the model to practical paleontological workflows. The paper also provides a fully reproducible, Docker-based framework and analyzes species-specific challenges, giving benchmarks for AI-assisted micropaleontology and its deployment in the geosciences.

Abstract

This study presents a comprehensive deep learning pipeline for the automated classification of 12 foraminifera species using 2D micro-CT slices derived from 3D scans. We curated a scientifically rigorous dataset comprising 97 micro-CT scanned specimens across 27 species, selecting 12 species with sufficient representation for robust machine learning. To ensure methodological integrity and prevent data leakage, we split the data at the specimen level, giving 109,617 high-quality 2D slices (44,103 for training, 14,046 for validation, and 51,468 for testing). We evaluated seven state-of-the-art 2D convolutional neural network (CNN) architectures using transfer learning. Our final ensemble model, combining ConvNeXt-Large and EfficientNetV2-Small, achieved a test accuracy of 95.64%, with a top-3 accuracy of 99.6% and an area under the ROC curve (AUC) of 0.998 across all species. To facilitate practical deployment, we developed an interactive dashboard that supports real-time slice classification and 3D slice matching using three similarity metrics: the structural similarity index (SSIM), normalized cross-correlation (NCC), and the Dice coefficient. This work establishes new benchmarks for AI-assisted micropaleontological identification and provides a fully reproducible framework for foraminifera classification research, bridging the gap between deep learning and applied geosciences.
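The specimen-level split mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the record layout, the function name `split_by_specimen`, and the rule of holding out one specimen per species for validation and one for testing are assumptions made for illustration only. The point it shows is that whole specimens, not individual slices, are assigned to a subset, so slices from one scan never appear on both sides of a split.

```python
import random
from collections import defaultdict

def split_by_specimen(records, seed=0):
    """Assign whole specimens to train/val/test.

    records: list of (specimen_id, species, slice_path) tuples.
    Assumption: specimen IDs are unique across species, and each species
    has at least three specimens (one each for val and test, rest train).
    """
    by_species = defaultdict(set)
    for specimen_id, species, _ in records:
        by_species[species].add(specimen_id)

    rng = random.Random(seed)
    subset_of = {}
    for species in sorted(by_species):
        ids = sorted(by_species[species])
        if len(ids) < 3:
            raise ValueError(f"{species}: need at least 3 specimens")
        rng.shuffle(ids)
        subset_of[ids[0]] = "test"   # hypothetical rule: one held-out specimen
        subset_of[ids[1]] = "val"    # hypothetical rule: one validation specimen
        for specimen_id in ids[2:]:
            subset_of[specimen_id] = "train"

    # Every slice follows its specimen into that specimen's subset.
    splits = {"train": [], "val": [], "test": []}
    for record in records:
        splits[subset_of[record[0]]].append(record)
    return splits

# Toy data: 2 placeholder species x 4 specimens x 5 slices each.
records = [
    (f"{species}_{i}", species, f"{species}_{i}/slice_{k:03d}.png")
    for species in ("species_a", "species_b")
    for i in range(4)
    for k in range(5)
]
splits = split_by_specimen(records)
for name, rows in splits.items():
    print(name, len(rows), "slices from", len({r[0] for r in rows}), "specimens")
```

A random split over slices would instead place neighbouring, nearly identical slices of the same specimen in both training and test data, which inflates test accuracy.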

Paper Structure

This paper contains 21 sections, 11 figures, and 8 tables.

Figures (11)

  • Figure 1: Deep learning pipeline for automated foraminifera classification. We curated 97 micro-CT scanned specimens across 27 species, selecting 12 for robust training. Using specimen-level data splitting, 109,617 2D slices were processed through seven CNN architectures. Our PatchEnsemble model achieved 95.64% accuracy and AUC = 0.998. An interactive dashboard enables real-time slice classification and 3D matching, setting new benchmarks for AI-assisted micropaleontology.
  • Figure 2: 3D volume rendering and sample 2D slice from micro-CT data for each species used in the study. We used Avizo software for visualization and conversion of micro-CT data to NIfTI files used for deep learning workflows.
  • Figure 3: Performance comparison of seven state-of-the-art 2D CNN architectures. Each model was evaluated on a balanced dataset of 109,617 micro-CT slices using standard classification metrics. Accuracy, F1-score, and AUC are reported for each architecture. ConvNeXt-Large achieved the highest accuracy, while both ConvNeXt-Large and NASNet demonstrated superior F1-scores. All models discriminated well between species, with AUC values exceeding 0.98.
  • Figure 4: Species-level classification metrics across models. F1-score, precision, and recall were computed for each species, revealing strong overall performance but notable weaknesses for Baculogypsina and Orbitoides. These species exhibited distinct error patterns (high false positives for Baculogypsina and high false negatives for Orbitoides), resulting in reduced F1-scores.
  • Figure 5: PatchEnsemble improves classification of difficult species. Compared to conventional ensemble methods, the PatchEnsemble strategy significantly boosts precision and recall for Baculogypsina and Orbitoides, which are otherwise prone to misclassification.
  • ...and 6 more figures
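
The Figure 4 caption links reduced F1-scores to two different error types. As a minimal sketch of that relationship, the snippet below derives per-class precision, recall, and F1 from a confusion matrix. The function name `per_class_metrics`, the placeholder class names, and the toy counts are assumptions for illustration and are not results from the paper. Many false positives lower precision, many false negatives lower recall, and either one lowers F1.

```python
def per_class_metrics(confusion, labels):
    """Per-class precision, recall, and F1.

    confusion[i][j] = number of slices of true class i predicted as class j.
    """
    metrics = {}
    n = len(labels)
    for k, name in enumerate(labels):
        tp = confusion[k][k]
        fp = sum(confusion[i][k] for i in range(n)) - tp  # predicted k, truly other
        fn = sum(confusion[k]) - tp                       # truly k, predicted other
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[name] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics

# Toy counts (not from the paper): class_x loses 2 of its 10 slices to class_y,
# so class_x has false negatives and class_y has false positives.
labels = ["class_x", "class_y"]
confusion = [
    [8, 2],
    [0, 10],
]
for name, m in per_class_metrics(confusion, labels).items():
    print(name, {key: round(value, 3) for key, value in m.items()})
```

In this toy case class_x keeps perfect precision but drops to 0.8 recall, while class_y keeps perfect recall but loses precision, mirroring the two error patterns the caption describes.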