CWT-Net: Super-resolution of Histopathology Images Using a Cross-scale Wavelet-based Transformer

Feiyang Jia, Zhineng Chen, Ziying Song, Lin Liu, Caiyan Jia

TL;DR

CWT-Net tackles the challenge of preserving multi-scale structural information in histopathology image super-resolution by coupling a dedicated SR branch with a Wavelet Transform branch that extracts cross-scale high-frequency features. A Transformer module enables cross-scale fusion, guided by a Wavelet Reconstruction (WR) block so WT information can be leveraged during training and testing, and a new MLCamSR dataset provides cross-scale, undegraded information for robust learning. Empirical results show state-of-the-art PSNR/SSIM gains and qualitative improvements in high-frequency detail, with demonstrated benefits to downstream diagnostic classification. This framework offers a practical path for pathology image enhancement and potential pre-training priors for related medical imaging tasks.
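
The PSNR figure mentioned above is a standard fidelity metric, and a minimal sketch of it may help when reading the reported gains. The function name `psnr` and the 8-bit peak value of 255 are assumptions made for illustration; the paper's exact evaluation protocol (color space, border cropping) is not stated in this section.

```python
import numpy as np


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images.

    max_value is the largest possible pixel value (255 assumed for 8-bit data).
    """
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    if reference.shape != estimate.shape:
        raise ValueError("images must have the same shape")
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0.0:
        # Identical images: PSNR is unbounded.
        return float("inf")
    return 10.0 * np.log10(max_value ** 2 / mse)
```

Higher values indicate a reconstruction closer to the reference. SSIM, the other metric named above, compares local structure rather than raw pixel error and is not sketched here.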

Abstract

Super-resolution (SR) aims to enhance the quality of low-resolution images and has been widely applied in medical imaging. We found that the design principles of most existing methods are influenced by SR tasks based on real-world images and do not take into account the significance of the multi-level structure in pathological images, even if they can achieve respectable objective metric evaluations. In this work, we delve into two super-resolution working paradigms and propose a novel network called CWT-Net, which leverages cross-scale image wavelet transform and Transformer architecture. Our network consists of two branches: one dedicated to learning super-resolution and the other to high-frequency wavelet features. To generate high-resolution histopathology images, the Transformer module shares and fuses features from both branches at various stages. Notably, we have designed a specialized wavelet reconstruction module to effectively enhance the wavelet domain features and enable the network to operate in different modes, allowing for the introduction of additional relevant information from cross-scale images. Our experimental results demonstrate that our model significantly outperforms state-of-the-art methods in both performance and visualization evaluations and can substantially boost the accuracy of image diagnostic networks.
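
To make the "high-frequency wavelet features" concrete, the sketch below performs one level of a 2D discrete wavelet transform and separates an image into a low-frequency approximation and three high-frequency detail subbands. The Haar basis and the function name `haar_dwt2` are assumptions chosen for simplicity; this section does not state which wavelet CWT-Net uses.

```python
import numpy as np


def haar_dwt2(image):
    """One level of an orthonormal 2D Haar transform (illustrative only).

    Returns the approximation subband and a tuple of three detail subbands,
    each half the height and width of the input.
    """
    x = np.asarray(image, dtype=np.float64)
    if x.ndim != 2 or x.shape[0] % 2 or x.shape[1] % 2:
        raise ValueError("expected a 2D array with even height and width")

    # The four pixels of every non-overlapping 2x2 block.
    a = x[0::2, 0::2]  # top-left
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right

    approx = (a + b + c + d) / 2.0       # low-frequency content
    detail_cols = (a - b + c - d) / 2.0  # differences between neighbouring columns
    detail_rows = (a + b - c - d) / 2.0  # differences between neighbouring rows
    detail_diag = (a - b - c + d) / 2.0  # diagonal differences
    return approx, (detail_cols, detail_rows, detail_diag)
```

The three detail subbands carry edges and fine texture, which is the kind of information the WT branch is described as learning. Applying the function to images at several magnifications would give cross-scale detail features in the sense used above.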

Paper Structure (21 sections, 14 equations, 6 figures, 6 tables)

Figures (6)

  • Figure 1: a. Digital medical images typically have multiple levels and dimensions, which raises generation and storage costs. Medical images with a pyramid structure store the same image at multiple magnifications, which helps healthcare professionals during examination. However, existing SR methods have not taken these characteristics into account. b. To tackle this issue, we propose CWT-Net, which combines a multitasking strategy to expand the utilization of pyramid-shaped data. c. CWT-Net aims to generate high-level digital medical images at a lower cost.
  • Figure 2: a. CWT-Net's Comprehensive Architecture: CWT-Net adopts a dual-branch architecture, each dedicated to distinct tasks. The SR branch (orange) focuses on transforming LR images into HR images. The WT branch (blue) is designed to capture wavelet features within HR images across multiple scales. Transformer blocks play a pivotal role in identifying analogous textures among cross-level wavelet features, which are then amalgamated with the SR branch's output. b. Wavelet Reconstruction (WR) Module: The WR module, featured here, integrates a channel attention mechanism characterized by deeper and wider inner channels to facilitate feature extraction.
  • Figure 3: The architecture of the Transformer block. At different stages of the network, features from the two branches, denoted Q, K, and V, enter the Transformer block. After transfer and embedding, these features are fused into the SR branch.
  • Figure 4: Qualitative results in the $2\times$ up-sampling task.
  • Figure 5: Qualitative results in the $4\times$ up-sampling task.
  • ...and 1 more figures
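
The fusion described in the Figure 3 caption can be approximated by plain scaled dot-product cross-attention. The sketch below is a rough illustration under stated assumptions, not the authors' block: it assumes queries come from the SR branch and keys and values from the WT branch, it omits learned projections and embeddings, and the name `cross_attention_fuse` is hypothetical.

```python
import numpy as np


def cross_attention_fuse(sr_tokens, wt_tokens):
    """Fuse WT-branch features into SR-branch features via cross-attention.

    sr_tokens: (n_sr, dim) array, used as queries Q (assumed role).
    wt_tokens: (n_wt, dim) array, used as keys K and values V (assumed role).
    Returns the fused SR features and the attention weights.
    """
    q = np.asarray(sr_tokens, dtype=np.float64)
    k = v = np.asarray(wt_tokens, dtype=np.float64)
    dim = q.shape[-1]

    # Similarity between every SR token and every WT token.
    scores = q @ k.T / np.sqrt(dim)

    # Numerically stable softmax over the WT tokens.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)

    # Transfer the most relevant wavelet features to each SR position,
    # then add them to the SR features as a residual.
    transferred = weights @ v
    return q + transferred, weights
```

Each row of the returned weights sums to one, so every SR position receives a convex combination of WT features. A real block would add learned Q, K, and V projections and, per the caption, an embedding step before fusion.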