FakET: Simulating Cryo-Electron Tomograms with Neural Style Transfer

Pavol Harar, Lukas Herrmann, Philipp Grohs, David Haselbach

TL;DR

This work tackles the scarcity of labeled cryo-electron tomography (cryoET) data for particle localization and classification by introducing FakET, a Neural Style Transfer–based method that simulates the forward operator of a transmission electron microscope (TEM) using unlabeled reference data. FakET achieves data quality comparable to the SHREC physics-based simulator while offering a 750-fold speedup and a 33-fold memory reduction, enabling the generation of large tilt-series without calibration. The authors validate FakET by training DeepFinder on faket-generated data, obtaining localization performance near the benchmark and classification performance close to it, with further gains achievable via limited fine-tuning. The approach provides a practical, open-source solution for pre-training and evaluating neural networks in cryoET, with potential extensions to experimental data validation and domain-specific pre-training.

Abstract

In cryo-electron microscopy, accurate particle localization and classification are imperative. Recent deep learning solutions, though successful, require extensive training data sets. The protracted generation time of physics-based models, often employed to produce these data sets, limits their broad applicability. We introduce FakET, a method based on Neural Style Transfer, capable of simulating the forward operator of any cryo transmission electron microscope. It can be used to adapt a synthetic training data set according to reference data, producing high-quality simulated micrographs or tilt-series. To assess the quality of our generated data, we used it to train a state-of-the-art localization and classification architecture and compared its performance with a counterpart trained on benchmark data. Remarkably, our technique matches the performance, boosts data generation speed 750 times, uses 33 times less memory, and scales well to typical transmission electron microscope detector sizes. It leverages GPU acceleration and parallel processing. The source code is available at https://github.com/paloha/faket.
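
As a rough illustration of what "based on Neural Style Transfer" usually means, the sketch below computes the classic Gram-matrix style loss and a mean-squared content loss on feature maps. It is a minimal sketch under stated assumptions, not the FakET implementation: the function names are hypothetical, the feature maps would normally come from a pre-trained convolutional network (random arrays stand in for them here), and the loss actually used in the paper may differ.

```python
import numpy as np


def gram_matrix(features):
    """Channel-by-channel correlations of a (channels, height, width) feature map."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    # Normalize by the number of spatial positions so the scale is size-independent.
    return flat @ flat.T / (h * w)


def style_loss(generated_features, style_features):
    """Mean squared difference between Gram matrices (spatial layout is ignored)."""
    diff = gram_matrix(generated_features) - gram_matrix(style_features)
    return float(np.mean(diff ** 2))


def content_loss(generated_features, content_features):
    """Mean squared difference between feature maps (spatial layout matters)."""
    return float(np.mean((generated_features - content_features) ** 2))


# Stand-in feature maps (assumption: real ones come from a pre-trained CNN).
rng = np.random.default_rng(0)
content_feats = rng.normal(size=(4, 8, 8))
style_feats = rng.normal(size=(4, 8, 8))

# Hypothetical weighting between the two terms.
style_weight = 10.0
total = content_loss(content_feats, content_feats) + style_weight * style_loss(
    content_feats, style_feats
)
```

In a full style-transfer loop, the generated image would be updated by gradient descent on such a combined loss. In this setting the content would correspond to a noiseless projection and the style to unlabeled reference data.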

Paper Structure (27 sections, 19 figures, 3 tables)

Figures (19)

  • Figure 1: Simulated projection taken from the SHREC 2021 data set. Axes x and y correspond to the width and height of the imaged grandmodel. Colorbar denotes simulated intensities in arbitrary units. See the appendix for a side-by-side comparison with other projections.
  • Figure 2: Noiseless projection used to create the input to our proposed method. Axes x and y correspond to the width and height of the imaged grandmodel. Colorbar denotes intensities measured using the Radon transform and negated so that particles have lower intensities than the background, as is the case in TEM, which measures the attenuation of electron beams. The particles are not embedded in any solvent (as if they were in vacuum instead of ice), so the background appears much brighter than in the simulated projections. See the appendix for a side-by-side comparison with other projections.
  • Figure 3: Baseline projection created by adding Gaussian noise to the noiseless projection. Axes x and y correspond to the width and height of the imaged grandmodel. Colorbar denotes simulated intensities in arbitrary units. Note that this projection is not exactly the same as the one in Figure 1 or in Figure 4; see the caption of Figure 4 for an explanation and the appendix for a side-by-side comparison with other projections.
  • Figure 4: The faket projection output by our method. Axes x and y correspond to the grandmodel's width and height, respectively, and the colorbar indicates simulated intensities in arbitrary units. Though visually similar to Figure 3, this projection differs subtly in a way that has a significant impact on the performance of DeepFinder (DF). This similarity complicates the comparison of projections from various simulators using currently available image metrics. See the appendix for a side-by-side comparison with other projections.
  • Figure 5: Diagram of the steps used to simulate the benchmark, baseline, faket, and noiseless projections and reconstructions. Red arrows highlight the steps to reproduce the SHREC data, from which we use the last tomogram for testing. All methods except SHREC were filtered using a reverse-engineered filter (see the benchmark appendix) because the SHREC filtering step is under-documented. The style projections never feature the same contents as the simulated ones (see the side-by-side comparison appendix). Grandmodels, noiseless artificial samples containing randomly scattered particles, were created using existing models of biological macromolecular structures, represented as Coulomb density volumes.
  • ...and 14 more figures
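
The captions of Figures 2 and 3 describe two simple operations: a negated Radon-transform projection of a noiseless volume, and a baseline obtained by adding Gaussian noise to it. The sketch below illustrates both on a toy volume. It is a minimal sketch under stated assumptions, not the authors' code: the function names, the toy "particle", and the noise level are hypothetical, and only the 0° tilt is shown, where the Radon transform reduces to a sum along the beam axis.

```python
import numpy as np


def noiseless_projection(volume, axis=0):
    """Project a density volume along the beam axis and negate the result.

    At 0 degrees tilt the Radon transform is a plain line integral, i.e. a sum
    along the beam direction. Negation makes particles darker than background.
    """
    return -volume.sum(axis=axis)


def baseline_projection(noiseless, noise_std, rng):
    """Add zero-mean Gaussian noise to a noiseless projection."""
    return noiseless + rng.normal(0.0, noise_std, size=noiseless.shape)


rng = np.random.default_rng(0)

# Toy grandmodel: empty volume (no solvent) with one box-shaped "particle".
volume = np.zeros((16, 32, 32))
volume[4:12, 10:20, 10:20] = 1.0

clean = noiseless_projection(volume)
noisy = baseline_projection(clean, noise_std=0.5, rng=rng)  # hypothetical noise level
```

A full tilt-series would repeat the projection over a range of tilt angles, for example by rotating the volume before summing.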