AV-GAN: Attention-Based Varifocal Generative Adversarial Network for Uneven Medical Image Translation

Zexin Li; Yiyang Lin; Zijie Fang; Shuyan Li; Xiu Li

TL;DR

This work tackles virtual staining in histopathology: translating H&E slides into MT and PAS stains while preserving tissue structure. It introduces AV-GAN, which combines two components: an Attention-Based Key Region Selection Module that targets regions with high translation difficulty, and a Varifocal Module whose dual-resolution generators model global and local features separately. An H channel loss, $L_H$, additionally constrains the nuclear channel. The model achieves state-of-the-art performance on H&E→MT and H&E→PAS, improving FID by 15.9 and 4.16 respectively while maintaining structural fidelity as measured by CSS. Ablations confirm the benefits of forgoing parameter sharing, of a suitable number of key regions, and of an appropriate region size. In practice, the approach enables high-quality virtual staining without repeated staining, which is valuable for downstream diagnostic tasks and automated analysis in pathology.
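
The H channel constraint mentioned above can be illustrated with a minimal sketch. It assumes the standard color-deconvolution route from RGB to HED (optical density multiplied by the inverse of a stain matrix) with the widely used Ruifrok-Johnston stain vectors, and it assumes an L1 form for the loss. The function names `rgb_to_hed` and `h_channel_loss` are hypothetical, and the exact matrix and loss form used by AV-GAN may differ.

```python
import numpy as np

# Stain vectors (rows: hematoxylin, eosin, DAB) in optical-density space.
# These are the commonly used Ruifrok-Johnston values; that AV-GAN uses
# exactly this matrix is an assumption.
RGB_FROM_HED = np.array([
    [0.65, 0.70, 0.29],
    [0.07, 0.99, 0.11],
    [0.27, 0.57, 0.78],
])
HED_FROM_RGB = np.linalg.inv(RGB_FROM_HED)


def rgb_to_hed(rgb):
    """Convert an RGB image in [0, 1], shape (H, W, 3), to HED concentrations."""
    rgb = np.clip(rgb, 1e-6, 1.0)       # avoid log(0)
    optical_density = -np.log(rgb)      # Beer-Lambert law
    return optical_density @ HED_FROM_RGB


def h_channel_loss(source_rgb, translated_rgb):
    """Hypothetical L_H: mean absolute difference between the two H channels."""
    h_src = rgb_to_hed(source_rgb)[..., 0]
    h_out = rgb_to_hed(translated_rgb)[..., 0]
    return float(np.mean(np.abs(h_src - h_out)))
```

Under these assumptions, a translation that leaves the hematoxylin (nuclear) channel unchanged incurs zero loss, whatever happens to the other stain channels.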

Abstract

Different types of staining highlight different structures in organs, thereby assisting in diagnosis. However, because a tissue section cannot be stained repeatedly, differently stained slides of the same tissue area cannot be obtained. Translating a slide that is easy to obtain (e.g., H&E) into staining types that are difficult to obtain (e.g., MT, PAS) is a promising way to solve this problem. However, some regions are closely connected to other regions; to maintain this connection they often have complex structures, which makes them difficult to translate and may lead to wrong translations. In this paper, we propose the Attention-Based Varifocal Generative Adversarial Network (AV-GAN), which addresses several problems in pathologic image translation, such as uneven translation difficulty across regions, mutual interference between information at multiple resolutions, and nuclear deformation. Specifically, we develop an Attention-Based Key Region Selection Module, which attends to regions with higher translation difficulty. We then develop a Varifocal Module to translate these regions at multiple resolutions. Experimental results show that the proposed AV-GAN outperforms existing image translation methods on two virtual kidney tissue staining tasks, improving FID by 15.9 and 4.16 on the H&E-MT and H&E-PAS tasks, respectively.
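
As a rough illustration of translating at multiple resolutions, the sketch below prepares the two kinds of input such a scheme needs: a downsampled global view and full-resolution crops of selected key regions. It is not the authors' implementation; the function names, the `(row, col)` top-left coordinate format, and the use of average pooling are all assumptions made for the example.

```python
import numpy as np


def downsample(image, factor):
    """Average-pool an (H, W, C) image by an integer factor."""
    h, w, c = image.shape
    h, w = h - h % factor, w - w % factor  # drop rows/cols that do not fit
    blocks = image[:h, :w].reshape(h // factor, factor, w // factor, factor, c)
    return blocks.mean(axis=(1, 3))


def varifocal_inputs(image, regions, region_size, factor=2):
    """Return a low-resolution global view and full-resolution key-region crops.

    regions is a list of (row, col) top-left corners (hypothetical format).
    """
    global_view = downsample(image, factor)
    crops = [image[r:r + region_size, c:c + region_size] for r, c in regions]
    return global_view, crops
```

In such a setup, the global view would feed a low-resolution generator and each crop a high-resolution one, so that coarse layout and fine local structure are modeled separately.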

Paper Structure (26 sections, 6 equations, 6 figures, 7 tables)

Introduction
Related Work
Virtual Staining in Histopathological Analysis
Generative Adversarial Network
Method
Attention-based Key Region Selection Module
Varifocal Module
Discriminator
RGB2HED
Loss function
H channel loss
Varifocal loss
Optimization objective
Experiments and Results
Dataset and Data Preprocessing
...and 11 more sections

Figures (6)

Figure 1: An H&E-stained patch and its attention map. The attention map shows that different regions deserve different degrees of attention. Structures such as the edges of cavities in this patch deserve more attention, which reflects the fact that edges are crucial for the shape of the translated image.
Figure 2: The structure of AV-GAN. $G_1$ and $G_2$ refer to the low-resolution generator and the high-resolution one respectively. $D_1$ and $D_2$ are the discriminators of high-resolution and low-resolution, which are not drawn in the figure. The Attention-Based Key Region Selection Module selects the region that is worth attention and the RGB2HED Block converts the RGB image to an HED image, whose H channel (nuclear channel) can be constrained.
Figure 3: The structure of the attention module. $Q$, $K$, and $V$ refer to the Query, Key, and Value networks. The matrix $A$ represents the attention map, which is calculated from $Q$ and $K$. The network selects the regions whose key structures need to be translated more precisely and returns a coordinate list in which each coordinate group represents a region.
Figure 4: Image translation results. The first row shows the original H&E-stained images; the second, third, and fourth rows show translations by UGATIT, CycleGAN, and AI-FFPE, respectively; the fifth to seventh rows show translations by AV-GAN with different numbers of regions. The task is to translate H&E staining into MT and PAS staining.
Figure 5: The leftmost patch is an MT patch generated by AI-FFPE, and the rightmost is a real MT patch from the same patient (the dataset is unpaired). A blue zone appears where the red arrow points, whereas the real patch indicates that there should be no blue in the patient's glomerulus.
...and 1 more figures
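
The attention-based selection described in the Figure 3 caption can be sketched as follows. This is a minimal example under stated assumptions: the image is split into a grid of cells with one feature vector each, $A$ is computed as scaled dot-product attention from $Q$ and $K$, and a cell's score is the mean attention it receives. The scoring rule, the function names, and the `(row, col)` output format are hypothetical and not taken from the paper.

```python
import numpy as np


def attention_map(features, w_q, w_k):
    """Scaled dot-product attention A = softmax(Q K^T / sqrt(d)) over N cells."""
    q = features @ w_q
    k = features @ w_k
    logits = q @ k.T / np.sqrt(q.shape[-1])
    logits -= logits.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=-1, keepdims=True)


def select_key_regions(attn, grid_shape, cell_size, num_regions):
    """Pick the most-attended cells; return their (row, col) pixel corners."""
    received = attn.mean(axis=0)  # assumption: score = mean attention received
    top = np.argsort(received)[::-1][:num_regions]
    rows, cols = np.unravel_index(top, grid_shape)
    return [(int(r) * cell_size, int(c) * cell_size) for r, c in zip(rows, cols)]
```

The returned coordinate list could then drive high-resolution cropping, with `num_regions` and `cell_size` corresponding to the region count and region size varied in the ablations.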
