GRATEV2.0: Computational Tools for Real-time Analysis of High-throughput High-resolution TEM (HRTEM) Images of Conjugated Polymers
Dhruv Gamdha, Ryan Fair, Adarsh Krishnamurthy, Enrique Gomez, Baskar Ganapathysubramanian
TL;DR
GRATEV2.0 provides an open-source, real-time HRTEM analysis framework for conjugated polymers that combines fast image processing with Bayesian parameter optimization and a Wasserstein-distance stopping criterion, which signals when further sampling no longer adds statistically significant information. The approach enables automated tuning of 13 material-specific processing parameters, robust segmentation, and scalable HPC-enabled throughput, demonstrated on the PCDTBT dataset with 4350 crystalline domains detected. By extracting features such as d-spacing, orientation, and crystal shape metrics, the method advances high-throughput nanoscale characterization in organic electronics. The combination of automated tuning, a quantitative stopping criterion, and open-source availability supports reproducibility and broad adoption for materials discovery and optimization.
Abstract
Automated analysis of high-resolution transmission electron microscopy (HRTEM) images is increasingly essential for advancing research in organic electronics, where precise characterization of nanoscale crystal structures is crucial for optimizing material properties. This paper introduces GRATEV2.0 (GRaph-based Analysis of TEM), an open-source computational framework for real-time analysis of HRTEM data, with a focus on characterizing complex microstructures in conjugated polymers. The framework is illustrated using Poly[N-9'-heptadecanyl-2,7-carbazole-alt-5,5-(4',7'-di-2-thienyl-2',1',3'-benzothiadiazole)] (PCDTBT), a key material in organic photovoltaics. GRATEV2.0 employs fast, automated image processing algorithms, enabling rapid extraction of structural features such as d-spacing, orientation, and crystal shape metrics. Gaussian process optimization rapidly identifies suitable values for the user-defined parameters of the approach, reducing the need for manual parameter tuning and thus enhancing reproducibility and usability. Additionally, GRATEV2.0 is compatible with high-performance computing (HPC) environments, allowing for efficient, large-scale data processing at near real-time speeds. A unique feature of GRATEV2.0 is a Wasserstein distance-based stopping criterion, which optimizes data collection by determining when further sampling no longer adds statistically significant information. This capability limits TEM facility time to what is needed while ensuring data adequacy for in-depth analysis. Open-source and tested on a substantial PCDTBT dataset, this tool offers a powerful, robust, and accessible solution for high-throughput material characterization in organic electronics.
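To make the idea of a Wasserstein distance-based stopping criterion concrete, the following is a minimal sketch and not the GRATEV2.0 implementation. It assumes a single scalar feature per crystal (for example, d-spacing), compares the pooled sample before and after each new batch of images, and stops once the first-order Wasserstein distance stays below a tolerance for a few consecutive batches. The function names, the tolerance `tol`, and the `patience` rule are all hypothetical choices made for illustration; the criterion actually used in the paper may differ.

```python
import bisect


def wasserstein_1d(a, b):
    """First-order Wasserstein distance between two 1-D empirical samples.

    Computed as the integral of |F_a(x) - F_b(x)| over x, where F_a and F_b
    are the empirical CDFs. The sample sizes may differ.
    """
    a, b = sorted(a), sorted(b)
    grid = sorted(set(a) | set(b))
    total = 0.0
    for left, right in zip(grid, grid[1:]):
        # Both empirical CDFs are constant on the interval [left, right).
        cdf_a = bisect.bisect_right(a, left) / len(a)
        cdf_b = bisect.bisect_right(b, left) / len(b)
        total += abs(cdf_a - cdf_b) * (right - left)
    return total


def batches_until_converged(batches, tol, patience=2):
    """Return how many batches were consumed before the distribution stabilized.

    Hypothetical rule: stop once adding a batch changes the pooled sample by
    less than `tol` (in Wasserstein distance) for `patience` batches in a row.
    Returns None if the rule is never satisfied.
    """
    pooled = []
    calm = 0
    for count, batch in enumerate(batches, start=1):
        previous = list(pooled)
        pooled.extend(batch)
        if not previous:
            continue  # nothing to compare against after the first batch
        distance = wasserstein_1d(previous, pooled)
        calm = calm + 1 if distance < tol else 0
        if calm >= patience:
            return count
    return None
```

In use, each batch would hold the feature values extracted from one newly acquired group of images, and acquisition would stop at the returned batch count. Comparing consecutive pooled samples is only one possible design; comparing against a held-out reference distribution would be another.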
