Self-Supervised Z-Slice Augmentation for 3D Bio-Imaging via Knowledge Distillation

Alessandro Pasqui, Sajjad Mahdavi, Benoit Vianay, Alexandra Colin, Alex McDougall, Rémi Dumollard, Yekaterina A. Miroshnikova, Elsa Labrune, Hervé Turlier

TL;DR

ZAugNet tackles the axial-resolution bottleneck in 3D bio-imaging by applying self-supervised learning to interpolate between consecutive z-slices with a GAN framework. A Teacher–Student knowledge-distillation scheme keeps the interpolator lightweight while preserving high fidelity, and ZAugNet+ extends this to continuous interpolation via a Digital Propagation Matrix. Each application doubles the axial resolution, turning a stack of $n$ slices into one of $2n - 1$ slices while maintaining structural integrity across diverse biological datasets. The approach outperforms traditional bicubic interpolation and CAFI-based methods in both accuracy and speed and is released as open source with Colab access to facilitate broad adoption in the life-sciences community.

Abstract

Three-dimensional biological microscopy has significantly advanced our understanding of complex biological structures. However, limitations due to microscopy techniques, sample properties or phototoxicity often result in poor z-resolution, hindering accurate cellular measurements. Here, we introduce ZAugNet, a fast, accurate, and self-supervised deep learning method for enhancing z-resolution in biological images. By performing nonlinear interpolation between consecutive slices, ZAugNet effectively doubles resolution with each iteration. Compared on several microscopy modalities and biological objects, it outperforms competing methods on most metrics. Our method leverages a generative adversarial network (GAN) architecture combined with knowledge distillation to maximize prediction speed without compromising accuracy. We also developed ZAugNet+, an extended version enabling continuous interpolation at arbitrary distances, making it particularly useful for datasets with nonuniform slice spacing. Both ZAugNet and ZAugNet+ provide high-performance, scalable z-slice augmentation solutions for large-scale 3D imaging. They are available as open-source frameworks in PyTorch, with an intuitive Colab notebook interface for easy access by the scientific community.
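
The iterative doubling described above can be illustrated with a minimal sketch. It assumes each pass inserts exactly one predicted slice between every pair of neighbouring slices, so a stack of $n$ slices becomes $2n - 1$ slices. The function names (`midpoint_placeholder`, `double_z`, `augment_z`) are hypothetical, and the placeholder interpolator is a plain linear average standing in for the trained generator; it is not ZAugNet's nonlinear interpolation.

```python
import numpy as np


def midpoint_placeholder(lower, upper):
    """Stand-in for a trained interpolator: plain average of two slices.

    This is NOT ZAugNet; it only marks where a learned model would be called.
    """
    return 0.5 * (lower + upper)


def double_z(stack, interpolate=midpoint_placeholder):
    """Insert one predicted slice between each pair of neighbours (n -> 2n - 1)."""
    n = stack.shape[0]
    out = np.empty((2 * n - 1,) + stack.shape[1:], dtype=np.float32)
    out[0::2] = stack  # original slices keep the even positions
    for i in range(n - 1):
        out[2 * i + 1] = interpolate(stack[i], stack[i + 1])
    return out


def augment_z(stack, passes, interpolate=midpoint_placeholder):
    """Apply the doubling step repeatedly; k passes give 2**k * (n - 1) + 1 slices."""
    for _ in range(passes):
        stack = double_z(stack, interpolate)
    return stack


# Synthetic stack with 18 slices of 32 x 32 pixels (random values, not real data).
low_res = np.random.default_rng(0).random((18, 32, 32), dtype=np.float32)
high_res = augment_z(low_res, passes=3)
print(high_res.shape[0])  # 137
```

Under this assumption the slice counts match the examples in the figure captions: 18 slices become 137 after three passes, and 20 slices become 77 after two.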

Paper Structure

This paper contains 32 sections, 9 equations, 5 figures.

Figures (5)

  • Figure 1: ZAugNet principle. a) Graphical representation of ZAugNet’s prediction process, demonstrating its ability to double the axial resolution of a 3D image (here a 16-cell Phallusia mammillata ascidian embryo). b) Self-supervised training scheme of ZAugNet: on the left, selected slices are removed from the stack to serve as ground truth. On the right, ZAugNet predicts the missing slices, optimizing its parameters by minimizing the error between the predicted slices and the original focal planes prior to their removal.
  • Figure 2: Neural network architecture and evaluation. a) ZAugNet’s architecture: it follows a GAN framework, where the Generator and Discriminator interact to improve the accuracy and reliability of generated biological images. Through a knowledge distillation process, the Teacher network transfers information to the Generator (Student), enabling a more efficient and lightweight design for predictions. ZAugNet+ extends this approach by allowing continuous interpolation between two slices, with users specifying the relative position of the interpolated slice using a Digital Propagation Matrix (DPM) that defines the relative distance $z_i$ to the target plane. b) Direct vs. adversarial training in ZAugNet: visual comparison of interpolation results using Generator-only (direct training) versus adversarial training with both the Generator and Discriminator, shown alongside the ground truth (16-cell Phallusia mammillata ascidian embryo). c) Quantitative evaluation using Fréchet Inception Distance (FID), where lower values indicate greater similarity to the ground truth. Additional metrics, including RMSE, PSNR, and SSIM, assess the inter-stack average error to further compare the effectiveness of both training approaches.
  • Figure 3: Comparison of ZAugNet, CAFI, and bicubic interpolation methods. Each method was applied to four different datasets to reach the same axial resolution as the available ground truth stacks, resulting in an 8-fold axial resolution increase for ascidian embryos and a 4-fold increase for cell nuclei, microtubule filaments, and human embryos. The inter-stack average error was quantified using RMSE, PSNR, and SSIM metrics. On the left, the first column presents ground truth slices, while the second column shows the corresponding ZAugNet-predicted slices obtained from the first interpolation step on the low-resolution stacks (2-fold increase). These slices represent the cases where ZAugNet’s performance is, on average, the lowest across all evaluated metrics.
  • Figure 4: Benchmarking on microscopy image analysis tasks. a) Cell volume conservation in ZAugNet-augmented 16-cell stage ascidian embryo images. A low-resolution image stack (18 slices) is axially augmented using ZAugNet, applied iteratively three times to achieve an 8-fold resolution increase (137 slices). Both the ZAugNet-augmented stack and the original high-resolution ground truth are segmented using the 3D Cellpose pipeline. Segmentation masks are aligned, and each cell in the ZAugNet-augmented mask is matched to its corresponding cell in the ground truth mask based on maximum pixel overlap. The plot (right) quantifies cell volume conservation (pixel counts) between the ZAugNet-augmented and ground truth segmentations, demonstrating the accuracy of ZAugNet in preserving biological structures. b) Roughness quantification in ZAugNet-augmented cell nuclei images. Four low-resolution image stacks of cell nuclei, each containing a different number of slices but acquired with the same axial resolution, undergo two iterative applications of ZAugNet, resulting in a 4-fold axial resolution increase per stack. Both the ZAugNet-augmented stacks and the original high-resolution ground truth stacks are segmented using a custom pipeline combining 2D and 3D Cellpose. For each nucleus in the segmentation masks, a surface point cloud is extracted by identifying the external pixels of each segmented structure. The point cloud is fitted using spherical harmonics, and their expansion coefficients are used to compute nuclear roughness measurements. The bar plot (right) presents the mean roughness values and their relative standard deviations for both the ground truth and ZAugNet-augmented datasets. c) Length conservation in ZAugNet-augmented microtubule filament images. (Left) Maximum projection of a low-resolution microtubule filament image (20 slices) and (Center) the corresponding higher-resolution image (77 slices), augmented through two successive applications of ZAugNet to achieve a 4-fold resolution increase and segmented into filamentous structures using SOAX 3D. (Right) 36 images obtained after z-resolution augmentation were analyzed using the 3D SOAX pipeline, and the results were compared to the ground truth of the same resolution. A box plot compares the cumulative filament length between predicted and ground truth images, and the number of detected filaments is indicated at the top. A visual comparison, with an IoU score of 0.79, shows that ZAugNet accurately predicts microtubule structure and spatial continuity.
  • Figure 5: Continuous z-interpolation at arbitrary distances with ZAugNet+. a) Example of a ZAugNet+ predicted slice at a relative distance of $z_i = 0.25$, shown alongside the corresponding ground truth image for visual comparison. b) Bar plot showing Fréchet Inception Distance (FID) values for two consecutive applications of ZAugNet+, achieving up to a 4-fold increase in axial resolution on the human embryos dataset. The red bar represents FID results reported by the authors of the Super-Focus model, which was trained on a significantly larger dataset (263,024 focal stacks compared to ZAugNet’s 4,165 focal stacks). The last blue bar reflects the FID value computed by comparing the sub-resolved ground truth (9 slices) with the ZAugNet+ predicted images (17 slices), demonstrating superior interpolation quality.
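
The self-supervised scheme of Figure 1b and the RMSE and PSNR metrics of Figures 2 and 3 can be sketched together as follows. This is a minimal illustration under stated assumptions: every interior slice is held out in turn and predicted from its two neighbours (the captions do not specify which slices are removed), intensities are normalized to [0, 1], and the predictor is a plain average used as a placeholder for a trained network. The names `make_triplets`, `rmse`, `psnr`, and `evaluate` are hypothetical and do not come from the ZAugNet code base.

```python
import numpy as np


def make_triplets(stack):
    """Yield (lower, held_out_middle, upper) from consecutive slices of one stack.

    Assumption: each interior slice serves once as ground truth.
    """
    for i in range(stack.shape[0] - 2):
        yield stack[i], stack[i + 1], stack[i + 2]


def rmse(pred, target):
    """Root-mean-square error between two images."""
    return float(np.sqrt(np.mean((pred - target) ** 2)))


def psnr(pred, target, data_range=1.0):
    """Peak signal-to-noise ratio in dB: 20 * log10(data_range / RMSE)."""
    err = rmse(pred, target)
    return float("inf") if err == 0 else float(20.0 * np.log10(data_range / err))


def evaluate(stack, predict):
    """Average RMSE and PSNR of `predict` over all held-out middle slices."""
    rmses, psnrs = [], []
    for lower, middle, upper in make_triplets(stack):
        pred = predict(lower, upper)
        rmses.append(rmse(pred, middle))
        psnrs.append(psnr(pred, middle))
    return float(np.mean(rmses)), float(np.mean(psnrs))


def average_placeholder(lower, upper):
    """Placeholder predictor (linear average), standing in for a trained model."""
    return 0.5 * (lower + upper)


# Synthetic stack: 10 slices of 16 x 16 pixels with values in [0, 1).
stack = np.random.default_rng(1).random((10, 16, 16))
mean_rmse, mean_psnr = evaluate(stack, average_placeholder)
print(f"mean RMSE = {mean_rmse:.3f}, mean PSNR = {mean_psnr:.2f} dB")
```

In a training loop, the error between `pred` and `middle` would drive the parameter update; here it is only measured. SSIM and FID are omitted because they require additional libraries and, for FID, a pretrained feature extractor.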