ForamDeepSlice: A High-Accuracy Deep Learning Framework for Foraminifera Species Classification from 2D Micro-CT Slices
Abdelghafour Halimi, Ali Alibrahim, Didier Barradas-Bautista, Ronell Sicat, Abdulkader M. Afifi
TL;DR
This work tackles automated foraminifera species classification from 2D micro-CT slices. The authors assemble a curated dataset of 97 micro-CT-scanned specimens across 27 species, of which 12 species are sufficiently represented for robust training, and evaluate seven pretrained CNNs under a specimen-level split to avoid data leakage. A two-phase training regime with transfer learning, strong data augmentation, and a PatchEnsemble of ConvNeXt-Large and EfficientNetV2-Small yields 95.64% test accuracy, 99.6% top-3 accuracy, and an AUC of 0.998. An interactive dashboard provides real-time slice classification and 3D slice matching using SSIM, NCC, and Dice, connecting the model to practical paleontological workflows. The paper also provides a fully reproducible, Docker-based framework and analyzes species-specific challenges, establishing benchmarks for AI-assisted micropaleontology and its deployment in the geosciences.
Abstract
This study presents a comprehensive deep learning pipeline for the automated classification of 12 foraminifera species using 2D micro-CT slices derived from 3D scans. We curated a dataset of 97 micro-CT-scanned specimens across 27 species and selected the 12 species with sufficient representation for robust machine learning. To prevent data leakage, we split the data at the specimen level, so that all slices from a given specimen fall into a single partition. The resulting dataset contains 109,617 high-quality 2D slices (44,103 for training, 14,046 for validation, and 51,468 for testing). We evaluated seven state-of-the-art 2D convolutional neural network (CNN) architectures using transfer learning. Our final ensemble model, combining ConvNeXt-Large and EfficientNetV2-Small, achieved a test accuracy of 95.64%, a top-3 accuracy of 99.6%, and an area under the ROC curve (AUC) of 0.998 across all species. To facilitate practical deployment, we developed an interactive dashboard that supports real-time slice classification and 3D slice matching using three similarity metrics: the structural similarity index (SSIM), normalized cross-correlation (NCC), and the Dice coefficient. This work establishes new benchmarks for AI-assisted micropaleontological identification and provides a fully reproducible framework for foraminifera classification research, bridging the gap between deep learning and applied geosciences.
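
The specimen-level split described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `specimen_level_split`, the `(slice_id, specimen_id, species)` record layout, the split fractions, and the requirement of at least three specimens per species are all assumptions made for illustration. The point it shows is that specimens, not slices, are assigned to partitions, so no specimen contributes slices to more than one partition.

```python
import random
from collections import defaultdict


def specimen_level_split(slices, val_frac=0.15, test_frac=0.3, seed=0):
    """Split slices into train/val/test so that each specimen lands in one split.

    `slices` is a list of (slice_id, specimen_id, species) tuples.
    The record layout and the fractions are illustrative assumptions.
    """
    # Collect the distinct specimens of each species.
    specimens_by_species = defaultdict(set)
    for _, specimen_id, species in slices:
        specimens_by_species[species].add(specimen_id)

    rng = random.Random(seed)
    split_of_specimen = {}
    for species, specimens in specimens_by_species.items():
        ordered = sorted(specimens)  # sort first so the shuffle is reproducible
        rng.shuffle(ordered)
        n = len(ordered)
        if n < 3:
            # Assumption: every species needs a specimen in each split.
            raise ValueError(f"species {species!r} has fewer than 3 specimens")
        n_test = max(1, round(n * test_frac))
        n_val = max(1, round(n * val_frac))
        if n_test + n_val >= n:
            raise ValueError(f"fractions leave no training specimen for {species!r}")
        for i, specimen_id in enumerate(ordered):
            if i < n_test:
                split_of_specimen[specimen_id] = "test"
            elif i < n_test + n_val:
                split_of_specimen[specimen_id] = "val"
            else:
                split_of_specimen[specimen_id] = "train"

    # Every slice inherits the split of the specimen it was cut from.
    result = {"train": [], "val": [], "test": []}
    for slice_id, specimen_id, _ in slices:
        result[split_of_specimen[specimen_id]].append(slice_id)
    return result


# Toy usage: 2 species, 4 specimens each, 5 slices per specimen.
toy = [
    (f"{sp}_{k}_{i}", f"{sp}_{k}", sp)
    for sp in ("A", "B")
    for k in range(4)
    for i in range(5)
]
splits = specimen_level_split(toy)
print({name: len(ids) for name, ids in splits.items()})
```

Splitting per species keeps every class present in all three partitions. A slice-level random split, by contrast, would place near-identical neighboring slices of the same specimen in both training and test data and inflate the reported accuracy.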

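Two of the similarity metrics named in the abstract, NCC and the Dice coefficient, are simple enough to sketch in a few lines. The version below is a hedged illustration on flattened pixel lists, not the dashboard's code: the function names, the zero-mean form of NCC, and the fixed binarization threshold used for Dice are assumptions. SSIM is omitted because it requires local windowed statistics.

```python
import math


def ncc(a, b):
    """Zero-mean normalized cross-correlation of two equal-length pixel lists.

    Returns a value in [-1, 1]; 1 means identical up to brightness and contrast.
    """
    if len(a) != len(b) or not a:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0:
        return 0.0  # a constant image has no structure to correlate
    return sum(x * y for x, y in zip(da, db)) / denom


def dice(a, b, threshold=0.5):
    """Dice coefficient of two pixel lists after thresholding to binary masks.

    The threshold is an illustrative assumption.
    """
    if len(a) != len(b):
        raise ValueError("inputs must be of equal length")
    mask_a = [x > threshold for x in a]
    mask_b = [y > threshold for y in b]
    overlap = sum(1 for p, q in zip(mask_a, mask_b) if p and q)
    total = sum(mask_a) + sum(mask_b)
    if total == 0:
        return 1.0  # two empty masks are treated as a perfect match
    return 2 * overlap / total


# Toy usage on flattened 2x2 "slices".
query = [0.0, 0.2, 0.9, 0.4]
candidate = [0.1, 0.3, 1.0, 0.5]  # the same pattern, uniformly brighter
print(ncc(query, candidate), dice(query, candidate))
```

NCC compares intensity patterns and is insensitive to uniform brightness shifts, while Dice compares the overlap of segmented shapes, so the two metrics capture different aspects of how well a query slice matches a candidate.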