Generative Image Restoration and Super-Resolution using Physics-Informed Synthetic Data for Scanning Tunneling Microscopy
Nikola L. Kolev, Tommaso Rodani, Neil J. Curson, Taylor J. Z. Stock, Alberto Cazzaniga
TL;DR
This work tackles slow STM throughput and tip-induced image degradation by introducing a physics-informed synthetic data pipeline for training conditioned flow-matching (FM) and diffusion (DDIM) models to restore and super-resolve STM images. By augmenting a small set of pristine Si(001):H images with realistic artefacts, the authors train FM and DDIM models that remove common degradations and upsample sparsely sampled images, enabling 2–4× faster acquisition while preserving atomic-scale structure. Quantitative metrics on synthetic data and perceptual metrics on degraded real images show improvements over autoencoders, with FM models consistently delivering strong restoration fidelity and super-resolution (SR) performance. The approach enables near-real-time processing on commodity hardware and is transferable to other scanning probe techniques, though it remains limited by multi-tip artefacts and by potential hallucinations under extreme degradation.
Abstract
Scanning tunnelling microscopy (STM) enables atomic-resolution imaging and atom manipulation, but its utility is often limited by tip degradation and slow serial data acquisition. Fabrication adds another layer of complexity since the tip is often subjected to large voltages, which may alter the shape of its apex, requiring it to be conditioned. Here, we propose a machine learning (ML) approach for image repair and super-resolution to alleviate both challenges. Using a dataset of only 36 pristine experimental images of Si(001):H, we demonstrate that a physics-informed synthetic data generation pipeline can be used to train several state-of-the-art flow-matching and diffusion models. Quantitative evaluation with metrics such as the CLIP Maximum Mean Discrepancy (CMMD) score and structural similarity demonstrates that our models are able to effectively restore images and offer a two- to fourfold reduction in image acquisition time by accurately reconstructing images from sparsely sampled data. Our framework has the potential to significantly increase STM experimental throughput by offering a route to reducing the frequency of tip-conditioning procedures and to enhancing frame rates in existing high-speed STM systems.
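To make the link between sparse sampling and acquisition time concrete, the following is a minimal sketch and not the authors' pipeline. It assumes that sparse sampling means acquiring only every k-th scan line, so that acquisition time falls roughly in proportion to k, and it uses plain linear interpolation as a naive stand-in for the learned super-resolution model. The function names `sparse_scan` and `linear_upsample_rows` and the sinusoidal test pattern are hypothetical.

```python
import numpy as np


def sparse_scan(image: np.ndarray, factor: int) -> np.ndarray:
    """Keep every `factor`-th row, mimicking a scan with fewer lines (assumption)."""
    return image[::factor, :]


def linear_upsample_rows(sparse: np.ndarray, n_rows: int, factor: int) -> np.ndarray:
    """Naive baseline: linearly interpolate the skipped rows, column by column."""
    measured = np.arange(0, n_rows, factor)  # indices of the rows that were acquired
    target = np.arange(n_rows)               # indices of all rows in the full image
    out = np.empty((n_rows, sparse.shape[1]), dtype=float)
    for col in range(sparse.shape[1]):
        # np.interp holds the last measured value for rows past the final sample
        out[:, col] = np.interp(target, measured, sparse[:, col])
    return out


# Hypothetical periodic pattern standing in for an atomic-resolution topograph.
y, x = np.mgrid[0:64, 0:64]
full = np.sin(2 * np.pi * x / 8) * np.cos(2 * np.pi * y / 16)

for factor in (2, 4):
    sparse = sparse_scan(full, factor)
    restored = linear_upsample_rows(sparse, full.shape[0], factor)
    line_reduction = full.shape[0] / sparse.shape[0]
    rmse = float(np.sqrt(np.mean((restored - full) ** 2)))
    print(f"factor={factor} lines={sparse.shape[0]} "
          f"reduction={line_reduction:.0f}x rmse={rmse:.4f}")
```

In this sketch the interpolated image reproduces the acquired rows exactly and only estimates the skipped ones. A generative model would replace `linear_upsample_rows`, with the aim of recovering atomic-scale detail that interpolation smooths out.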
