Towards Automated Petrography
Isai Daniel Chacón, Paola Ruiz Puentes, Jillian Pearse, Pablo Arbeláez
TL;DR
This work introduces LITHOS, the largest publicly available benchmark for automated petrography. It pairs plane-polarized light (PPL) and cross-polarized light (XPL) images with over 100k expert grain annotations across 25 mineral classes. The dataset comprises 580 thin sections and 211,604 high-resolution patches, and its grain-axis annotations provide weak supervision for instance-level learning. It also proposes a dual-encoder transformer baseline that fuses features from both polarizations and outperforms single-polarization models on binary and multi-class mineral classification. The dataset, code, and pretrained models are released to support reproducibility and interdisciplinary research on scalable mineral identification and texture characterization in geological samples.
Abstract
Petrography is a branch of geology that analyzes the mineralogical composition of rocks from microscopic thin-section samples. It is essential for understanding rock properties across geology, archaeology, engineering, mineral exploration, and the oil industry. However, petrography is labor-intensive: experts must visually examine each thin section in detail under an optical polarization microscope, which limits scalability and motivates automated techniques. To address this challenge, we introduce the Large-scale Imaging and Thin section Optical-polarization Set (LITHOS), the largest and most diverse publicly available experimental framework for automated petrography. LITHOS includes 211,604 high-resolution RGB patches acquired under polarized light and 105,802 expert-annotated grains across 25 mineral categories. Each annotation consists of the mineral class, spatial coordinates, and expert-defined major and minor axes represented as intersecting vector paths, capturing grain geometry and orientation. We evaluate multiple deep learning techniques for mineral classification on LITHOS and propose a dual-encoder transformer architecture that integrates both polarization modalities as a strong baseline for future work. Our method consistently outperforms single-polarization models, demonstrating the value of combining polarizations for mineral classification. We have made the LITHOS Benchmark publicly available, comprising our dataset, code, and pretrained models, to foster reproducibility and further research in automated petrographic analysis.
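To make the annotation format concrete, the following is a minimal sketch of how one grain record (mineral class, spatial coordinates, and intersecting major and minor axes) could be represented, and how grain geometry and orientation might be derived from it. The class and field names (`Axis`, `GrainAnnotation`, `center`, and so on), the example coordinates, and the use of pixel units are assumptions made for illustration; they are not the released LITHOS schema.

```python
import math
from dataclasses import dataclass

Point = tuple[float, float]  # (x, y) in pixel coordinates (assumed)


def _cross(o: Point, a: Point, b: Point) -> float:
    """2D cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


@dataclass(frozen=True)
class Axis:
    """A grain axis stored as a straight vector path (hypothetical layout)."""
    start: Point
    end: Point

    @property
    def length(self) -> float:
        return math.dist(self.start, self.end)

    @property
    def angle_deg(self) -> float:
        # Orientation in [0, 180), measured in image coordinates (y points down).
        dx = self.end[0] - self.start[0]
        dy = self.end[1] - self.start[1]
        return math.degrees(math.atan2(dy, dx)) % 180.0


@dataclass(frozen=True)
class GrainAnnotation:
    """One annotated grain: class label, location, and two axes."""
    mineral: str
    center: Point
    major: Axis
    minor: Axis

    @property
    def aspect_ratio(self) -> float:
        # Elongation of the grain: major length over minor length.
        return self.major.length / self.minor.length

    def axes_intersect(self) -> bool:
        # True when the two axis segments strictly cross each other.
        d1 = _cross(self.minor.start, self.minor.end, self.major.start)
        d2 = _cross(self.minor.start, self.minor.end, self.major.end)
        d3 = _cross(self.major.start, self.major.end, self.minor.start)
        d4 = _cross(self.major.start, self.major.end, self.minor.end)
        return d1 * d2 < 0 and d3 * d4 < 0


# Illustrative record; the label and coordinates are made up.
grain = GrainAnnotation(
    mineral="quartz",
    center=(50.0, 40.0),
    major=Axis((10.0, 40.0), (90.0, 40.0)),
    minor=Axis((50.0, 25.0), (50.0, 55.0)),
)
print(grain.major.length, grain.major.angle_deg, grain.axes_intersect())
```

Under these assumptions, the axis lengths give a rough grain size and elongation, and the major-axis angle gives an orientation, which is one way such annotations could serve as weak supervision for instance-level tasks.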
