A Physics-Inspired Deep Learning Framework with Polar Coordinate Attention for Ptychographic Imaging

Han Yue, Jun Cheng, Yu-Xuan Ren, Chien-Chun Chen, Grant A. van Riessen, Philip Heng Wai Leong, Steve Feng Shu

TL;DR

This paper tackles the phase-retrieval challenge in ptychographic imaging by reorienting deep learning toward diffraction physics. It introduces PPN, a dual-branch architecture that combines Local Dependencies via ViT blocks with Non-Local Coherence via Polar Coordinate Attention (PoCA), aligning attention with reciprocal-space geometry. PoCA encodes radial-angular correlations and a learnable center, enabling superior high-frequency preservation and robust performance across low-overlap acquisitions, with substantial speedups over iterative methods and far fewer parameters than pure transformers. The approach yields notable gains in amplitude/phase reconstruction, demonstrates generalization to experimental data, and offers data-efficient learning for high-throughput, real-world diffraction imaging. Overall, PPN advances physics-informed DL for frequency-domain inverse problems and suggests broad applicability to Cryo-EM, X-ray, and astronomical imaging alike.

Abstract

Ptychographic imaging confronts inherent challenges in applying deep learning for phase retrieval from diffraction patterns. Conventional neural architectures, both convolutional neural networks and Transformer-based methods, are optimized for natural images; their inductive biases, based on Euclidean spatial neighborhoods, are geometrically mismatched with the concentric coherent patterns characteristic of diffraction data in reciprocal space. In this paper, we present PPN, a physics-inspired deep learning network with Polar Coordinate Attention (PoCA) for ptychographic imaging that aligns neural inductive biases with diffraction physics through a dual-branch architecture separating local feature extraction from non-local coherence modeling. Its PoCA mechanism replaces Euclidean spatial priors with physically consistent radial-angular correlations. PPN outperforms existing end-to-end models, with spectral and spatial analysis confirming its greater preservation of high-frequency details. Notably, PPN maintains robust performance compared to iterative methods even at low overlap ratios, making it well suited for high-throughput imaging in real-world acquisition scenarios for samples with consistent structural characteristics.
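
To make the radial-angular view concrete, the following is a minimal sketch of how each pixel of a diffraction pattern could be assigned polar coordinates relative to a center, and then grouped into radial and angular bins. It is an illustration only and not the authors' PoCA implementation: the function names `polar_coordinates` and `polar_bins`, the bin counts, and the default center are all assumptions. In PPN the center is described as learnable, whereas here it is a plain argument.

```python
import numpy as np

def polar_coordinates(height, width, center=None):
    """Return per-pixel (radius, angle) relative to a center given as (cy, cx).

    Hypothetical helper: the center defaults to the geometric middle of the
    grid, standing in for the learnable center mentioned in the text.
    """
    if center is None:
        center = ((height - 1) / 2.0, (width - 1) / 2.0)
    cy, cx = center
    ys, xs = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
    dy, dx = ys - cy, xs - cx
    radius = np.hypot(dy, dx)      # distance from the center, in pixels
    angle = np.arctan2(dy, dx)     # angle in [-pi, pi]
    return radius, angle

def polar_bins(radius, angle, n_radial=4, n_angular=8):
    """Quantize polar coordinates into integer ring and sector indices."""
    r_max = max(radius.max(), 1e-12)  # avoid division by zero on 1x1 grids
    r_idx = np.minimum((radius / r_max * n_radial).astype(int), n_radial - 1)
    frac = (angle + np.pi) / (2.0 * np.pi)
    a_idx = np.minimum((frac * n_angular).astype(int), n_angular - 1)
    return r_idx, a_idx

radius, angle = polar_coordinates(5, 5)
r_idx, a_idx = polar_bins(radius, angle)
```

Pixels that share a ring index lie at a similar distance from the center, so they are neighbors in the polar sense even when they are far apart on the Euclidean grid. This is the kind of adjacency that a polar-coordinate attention mechanism can exploit.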

Paper Structure

This paper contains 35 sections, 13 equations, 11 figures, 4 tables.

Figures (11)

  • Figure 1: Diffraction pattern characteristics and diffraction physics. (a): Diffraction patterns exhibit a radial distribution of information, with varying requirements for capturing low and high-frequency information. (b): 2D diffraction patterns can be viewed as projections of the intersection between the Ewald sphere and the crystal onto the detector plane. The sphere's radius is $1/\lambda$, where $\lambda$ is the wavelength of the incident and diffracted beams. Importantly, the spatial adjacency preference observed in natural images' feature space also holds in the polar coordinate perspective of diffraction patterns.
  • Figure 2: The proposed PPN for ptychographic imaging. It features dual branches: a Local Dependencies Branch with standard ViT blocks, and a Non-Local Coherence Branch with a Polar Coordinate Attention mechanism. The model processes logarithmically mapped diffraction patterns, combining features from both branches before decoding into individual amplitude and phase reconstructions at each position, which are then separately stitched to generate full field-of-view images for both amplitude and phase.
  • Figure 3: Performance comparison of single-shot experiment results and full-stitched scene retrieval using simulated data. (a) and (b) show amplitude and phase reconstructions at two representative scan positions. (c) displays the full-field amplitude and phase images stitched together from individual reconstructions at all scan positions across different models.
  • Figure 4: Frequency analysis comparison of full-scene ptychographic retrievals on simulated data. (a) 1D diagonal cross-sections of the average 2D PSD from stitched simulations. Red curve: ground truth; blue curve: models. The red box marks clearly abnormal model retrievals. (b) Quantitative breakdown of energy distribution across low, mid, and high frequency bands. Frequency ranges are defined based on the radial distance from the PSD center, with boundaries at 1/3 and 2/3 of the maximum frequency.
  • Figure 5: Performance comparison on real experimental samples. (a,b) Visual comparison of retrieved amplitude and phase (scale bar: 200 nm). For a specific hook-shaped detail in the ROI (region of interest), only our model effectively restores it, significantly outperforming CNN-based methods in fine structure retrieval. (c-e) Quantitative comparison of PSNR, SSIM, and MSE.
  • ...and 6 more figures
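
The band split described in the Figure 4(b) caption can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function name `band_energy_fractions` is hypothetical, the PSD is taken as the squared magnitude of the centered 2D FFT, and the "maximum frequency" is assumed to be the largest radial distance from the PSD center to any pixel (the caption does not say whether the corner or the edge is used).

```python
import numpy as np

def band_energy_fractions(image):
    """Fraction of PSD energy in low, mid and high radial frequency bands.

    Band boundaries sit at 1/3 and 2/3 of the maximum radial distance from
    the PSD center (assumed definition of the maximum frequency).
    """
    spectrum = np.fft.fftshift(np.fft.fft2(image))
    psd = np.abs(spectrum) ** 2
    h, w = psd.shape
    cy, cx = h // 2, w // 2  # location of the zero frequency after fftshift
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    radius = np.hypot(ys - cy, xs - cx)
    r_max = radius.max()

    low = radius <= r_max / 3.0
    mid = (radius > r_max / 3.0) & (radius <= 2.0 * r_max / 3.0)
    high = radius > 2.0 * r_max / 3.0

    total = psd.sum()
    if total == 0:
        raise ValueError("image has zero spectral energy")
    return {
        "low": float(psd[low].sum() / total),
        "mid": float(psd[mid].sum() / total),
        "high": float(psd[high].sum() / total),
    }

# A constant image has all of its energy at zero frequency (low band).
flat = band_energy_fractions(np.ones((8, 8)))

# A pixel-level checkerboard has all of its energy at the Nyquist corner.
yy, xx = np.indices((8, 8))
checker = band_energy_fractions((-1.0) ** (yy + xx))
```

Comparing these fractions between a reconstruction and its ground truth gives a compact measure of how much high-frequency detail a model preserves.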