Generative Image Restoration and Super-Resolution using Physics-Informed Synthetic Data for Scanning Tunneling Microscopy

Nikola L. Kolev, Tommaso Rodani, Neil J. Curson, Taylor J. Z. Stock, Alberto Cazzaniga

TL;DR

This work tackles slow STM throughput and tip-induced degradation by introducing a physics-informed synthetic data pipeline to train conditioned flow-matching (FM) and diffusion models for image restoration and super-resolution (SR) of STM images. By augmenting a small set of pristine Si(001):H images with realistic artefacts, the authors train FM and denoising diffusion implicit models (DDIM) that remove common degradations and upsample images, achieving 2–4× acquisition speedups while preserving atomic-scale structure. Quantitative metrics on synthetic data and perceptual metrics on degraded real images show improvements over autoencoders, with FM models consistently delivering strong restoration fidelity and SR performance. The approach enables near-real-time processing on commodity hardware and is transferable to other scanning probe techniques, though it remains limited by multi-tip artefacts and potential hallucinations under extreme degradation.

Abstract

Scanning tunnelling microscopy (STM) enables atomic-resolution imaging and atom manipulation, but its utility is often limited by tip degradation and slow serial data acquisition. Fabrication adds another layer of complexity since the tip is often subjected to large voltages, which may alter the shape of its apex, requiring it to be conditioned. Here, we propose a machine learning (ML) approach for image repair and super-resolution to alleviate both challenges. Using a dataset of only 36 pristine experimental images of Si(001):H, we demonstrate that a physics-informed synthetic data generation pipeline can be used to train several state-of-the-art flow-matching and diffusion models. Quantitative evaluation with metrics such as the CLIP Maximum Mean Discrepancy (CMMD) score and structural similarity demonstrates that our models are able to effectively restore images and offer a two- to fourfold reduction in image acquisition time by accurately reconstructing images from sparsely sampled data. Our framework has the potential to significantly increase STM experimental throughput by offering a route to reducing the frequency of tip-conditioning procedures and to enhancing frame rates in existing high-speed STM systems.
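
To make the idea of synthetic training pairs concrete, the following is a minimal sketch and not the authors' pipeline. It assumes scan-line noise can be approximated by an independent Gaussian height offset per row, and that sparse sampling can be approximated by keeping every `factor`-th pixel. The function names `add_scan_line_noise` and `subsample`, the `sigma` value, and the random array standing in for a pristine image are all hypothetical.

```python
import numpy as np


def add_scan_line_noise(image, sigma=0.05, rng=None):
    """Add one random height offset per scan line (row).

    Assumption: row offsets are i.i.d. Gaussian with standard deviation sigma.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Shape (rows, 1) so the same offset is broadcast across each row.
    offsets = rng.normal(0.0, sigma, size=(image.shape[0], 1))
    return image + offsets


def subsample(image, factor):
    """Keep every `factor`-th pixel along both axes (sparse sampling)."""
    return image[::factor, ::factor]


rng = np.random.default_rng(0)
pristine = rng.random((64, 64))  # stand-in for a pristine STM image
degraded = add_scan_line_noise(pristine, sigma=0.05, rng=rng)
low_res = subsample(pristine, 4)

# (degraded, pristine) would form a restoration training pair,
# (low_res, pristine) a x4 super-resolution training pair.
print(degraded.shape, low_res.shape)
```

In a setup like this, a restoration model would be trained to map `degraded` back to `pristine`, and a super-resolution model to map `low_res` back to `pristine`.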

Paper Structure

This paper contains 15 sections, 24 equations, 8 figures, 15 tables.

Figures (8)

  • Figure 1: Comparison of experimental and synthetic STM degradations. Each image is 25 nm $\times$ 25 nm (taken at -2 V and between 20 pA and 60 pA). The left column shows pristine experimental images, the middle column shows the same pristine images after synthetic degradation, and the right column shows degraded experimental images. Each row shows a specific degradation type that we later aim to correct.
  • Figure 2: Qualitative results of the image restoration models on experimental STM data. The first and last rows are real experimental images (images taken at -2 V, 20 pA) of the same 25 nm $\times$ 25 nm area of Si(001):H. The middle rows show restored versions of the raw data (a, b) using two models: FM Large with two inference steps (c, d) and the Autoencoder (e, f). FM Large more effectively suppresses artefacts like scan line noise and preserves atomic detail, while the Autoencoder tends to introduce distortions such as blurriness or unnatural contrast.
  • Figure 3: Qualitative results of the image restoration models on experimental STM data. The first and last rows are real experimental images (images taken at -2 V, 20 pA) of the same 25 nm $\times$ 25 nm area of Si(001):H. The middle rows show restored versions of the raw data (a, b) using two models: FM Large with two inference steps (c, d) and FM Small with two inference steps (e, f). The two restorations are similar, although FM Large is slightly better at removing the double-tip artefact in (a). Neither model succeeded in restoring image (h) completely.
  • Figure 4: Quantitative evaluation of image restoration models on the synthetic test set. The violin plots show the distribution of (A) Peak Signal-to-Noise Ratio (PSNR) and (B) Structural Similarity Index Measure (SSIM) for the noisy inputs, the Autoencoder baseline, the best-performing small generative model, and the best-performing large model.
  • Figure 5: Qualitative results for $\times$2 (a, c, e) and $\times$4 (b, d, f) super-resolution on experimental STM images (taken at -2 V, 30 pA) of the same 25 nm $\times$ 25 nm area of Si(001):H. For each upscaling factor, the low-resolution input (top row) is reconstructed by the FM Large model (middle row) and compared against the high-resolution ground-truth image (bottom row). The model successfully restores fine atomic details, though minor discrepancies in small defects are visible in the $\times$4 case, as shown in the insets.
  • ...and 3 more figures
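
The PSNR metric named in Figure 4 follows the standard definition $\mathrm{PSNR} = 10 \log_{10}(\mathrm{MAX}^2 / \mathrm{MSE})$. The sketch below shows that definition in NumPy. It assumes images are normalised to a known `data_range` (1.0 here); the paper's exact normalisation is not stated in this summary, and the function name `psnr` is illustrative.

```python
import numpy as np


def psnr(reference, estimate, data_range=1.0):
    """Peak Signal-to-Noise Ratio in dB between two images of equal shape.

    Assumption: pixel values span at most `data_range`.
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(data_range ** 2 / mse))


clean = np.zeros((8, 8))
noisy = clean + 0.1  # uniform error of 0.1, so MSE = 0.01
print(psnr(clean, noisy))
```

Higher PSNR means the restored image is closer, pixel by pixel, to the reference. SSIM complements it by comparing local structure rather than raw pixel error.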