Contrast transfer functions help quantify neural network out-of-distribution generalization in HRTEM
Luis Rangel DaCosta, Mary C. Scott
TL;DR
This work addresses out-of-distribution (OOD) generalization in high-resolution TEM (HRTEM) semantic segmentation using a data-centric, information-theoretic framework. It introduces two information-transfer metrics, $\epsilon(\chi)$ and $\sigma(\chi, \chi')$, derived from the TEM contrast transfer function $T(\boldsymbol{q})$, to quantify how training and test imaging conditions differ and overlap. Through large-scale synthetic data experiments with over 12,000 networks trained on multislice TEM simulations, the authors show that OOD performance degrades smoothly under imaging-condition shifts and can be predicted from information-transfer relationships, which can guide training-data design. The study provides a principled approach for deploying TEM-based ML workflows, notes its limitations in explaining shifts in the atomic-structure distribution, and suggests avenues for broader applicability and domain-agnostic OOD assessment.
Abstract
Neural networks, while effective for tackling many challenging scientific tasks, are not known to perform well out-of-distribution (OOD), i.e., within domains that differ from their training data. Understanding neural network OOD generalization is paramount to their successful deployment in experimental workflows, especially when ground-truth knowledge about the experiment is hard to establish or experimental conditions vary significantly. With inherent access to ground-truth information and fine-grained control of underlying distributions, simulation-based data curation facilitates precise investigation of OOD generalization behavior. Here, we probe generalization with respect to imaging conditions of neural network segmentation models for high-resolution transmission electron microscopy (HRTEM) imaging of nanoparticles, training and measuring the OOD generalization of over 12,000 neural networks using synthetic data generated via random structure sampling and multislice simulation. Using the HRTEM contrast transfer function, we further develop a framework to compare the information content of HRTEM datasets and quantify OOD domain shifts. We demonstrate that neural network segmentation models enjoy significant performance stability, but that their performance worsens smoothly and predictably as imaging conditions shift from the training distribution. Lastly, we consider limitations of our approach in explaining other OOD shifts, such as shifts in the atomic structures, and discuss complementary techniques for understanding generalization in such settings.
