Restore-RWKV: Efficient and Effective Medical Image Restoration with RWKV

Zhiwen Yang, Jiayin Li, Hui Zhang, Dan Zhao, Bingzheng Wei, Yan Xu

TL;DR

The paper addresses the challenge of restoring high-quality medical images from degraded inputs while balancing a large receptive field against computational cost. It introduces Restore-RWKV, an RWKV-based backbone adapted to 2D medical images via Re-WKV attention for global dependencies and Omni-Shift for rich local interactions, implemented in a 4-level U-Net with skip connections. Across PET, CT, MRI, and all-in-one restoration tasks, Restore-RWKV achieves state-of-the-art results. A lightweight variant outperforms several baselines, and ablations confirm that Re-WKV and Omni-Shift are both critical to expanding the effective receptive field. The approach offers a scalable, efficient backbone for high-resolution medical image restoration (MedIR), enabling broader clinical deployment and multi-task restoration.

Abstract

Transformers have revolutionized medical image restoration, but their quadratic complexity still limits their application to high-resolution medical images. The recent advent of the Receptance Weighted Key Value (RWKV) model in the natural language processing field has attracted much attention due to its ability to process long sequences efficiently. To leverage its advanced design, we propose Restore-RWKV, the first RWKV-based model for medical image restoration. Since the original RWKV model is designed for 1D sequences, we make two necessary modifications for modeling spatial relations in 2D medical images. First, we present a recurrent WKV (Re-WKV) attention mechanism that captures global dependencies with linear computational complexity. Re-WKV incorporates bidirectional attention as the basis for a global receptive field and recurrent attention to effectively model 2D dependencies from various scan directions. Second, we develop an omnidirectional token shift (Omni-Shift) layer that enhances local dependencies by shifting tokens from all directions and across a wide context range. These adaptations make the proposed Restore-RWKV an efficient and effective model for medical image restoration. Even a lightweight variant of Restore-RWKV, with only 1.16 million parameters, achieves comparable or even superior results compared to existing state-of-the-art (SOTA) methods. Extensive experiments demonstrate that the resulting Restore-RWKV achieves SOTA performance across a range of medical image restoration tasks, including PET image synthesis, CT image denoising, MRI image super-resolution, and all-in-one medical image restoration. Code is available at: https://github.com/Yaziwel/Restore-RWKV.
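To make the idea of recurrent, multi-directional scanning more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The abstract does not list the scan directions, so row-wise and column-wise orders are assumed here. The function `bidirectional_mix` is a hypothetical stand-in for Bi-WKV: a decay-weighted average over all tokens in both directions, computed in two linear passes. It only illustrates how a bidirectional operator can see the whole sequence at O(N) cost.

```python
from typing import List

Grid = List[List[float]]


def flatten(grid: Grid, column_major: bool = False) -> List[float]:
    """Turn an H x W grid into a 1D token sequence in the chosen scan order."""
    h, w = len(grid), len(grid[0])
    if column_major:
        return [grid[r][c] for c in range(w) for r in range(h)]
    return [grid[r][c] for r in range(h) for c in range(w)]


def unflatten(seq: List[float], h: int, w: int, column_major: bool = False) -> Grid:
    """Inverse of flatten: place a 1D sequence back on the H x W grid."""
    if column_major:
        return [[seq[c * h + r] for c in range(w)] for r in range(h)]
    return [list(seq[r * w:(r + 1) * w]) for r in range(h)]


def bidirectional_mix(seq: List[float], decay: float = 0.5) -> List[float]:
    """Hypothetical stand-in for a bidirectional attention operator (NOT Bi-WKV).

    out[t] = sum_i decay**|t - i| * seq[i] / sum_i decay**|t - i|,
    computed with one forward and one backward pass, so the cost is O(N).
    """
    n = len(seq)
    fwd_num, fwd_den = [0.0] * n, [0.0] * n
    bwd_num, bwd_den = [0.0] * n, [0.0] * n
    for t in range(n):  # left-to-right running sums
        fwd_num[t] = seq[t] + (decay * fwd_num[t - 1] if t > 0 else 0.0)
        fwd_den[t] = 1.0 + (decay * fwd_den[t - 1] if t > 0 else 0.0)
    for t in range(n - 1, -1, -1):  # right-to-left running sums
        bwd_num[t] = seq[t] + (decay * bwd_num[t + 1] if t < n - 1 else 0.0)
        bwd_den[t] = 1.0 + (decay * bwd_den[t + 1] if t < n - 1 else 0.0)
    # The current token appears in both passes, so subtract it once.
    return [
        (fwd_num[t] + bwd_num[t] - seq[t]) / (fwd_den[t] + bwd_den[t] - 1.0)
        for t in range(n)
    ]


def recurrent_scan(grid: Grid, num_steps: int = 2) -> Grid:
    """Apply the mixing operator repeatedly, alternating the scan direction."""
    h, w = len(grid), len(grid[0])
    for step in range(num_steps):
        column_major = step % 2 == 1  # assumed order: rows first, then columns
        mixed = bidirectional_mix(flatten(grid, column_major))
        grid = unflatten(mixed, h, w, column_major)
    return grid


# Usage: a single bright pixel spreads to every position of the grid.
impulse = [[0.0] * 3 for _ in range(3)]
impulse[1][1] = 1.0
spread = recurrent_scan(impulse)
```

In this sketch, alternating the scan order means that pixels which are far apart in the row-wise sequence (vertical neighbours) become adjacent in the column-wise one, which is the intuition behind scanning a 2D image from more than one direction.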

Paper Structure

This paper contains 20 sections, 11 equations, 9 figures, 8 tables.

Figures (9)

  • Figure 1: The effective receptive field (ERF) (Luo et al., 2016) visualization for different efficient models. A more extensively distributed dark area indicates a larger ERF. Our proposed Restore-RWKV achieves the most significant global ERF.
  • Figure 2: (a) Overview of the Restore-RWKV architecture. (b) Illustration of the R-RWKV block, which incorporates a Re-WKV attention mechanism to model global dependencies with linear complexity, and an Omni-Shift layer to capture local context.
  • Figure 3: Illustration of the Re-WKV attention mechanism. Re-WKV employs Bi-WKV (Duan et al., 2024) as its basic attention operator and applies Bi-WKV attention to 2D images recurrently through various scan directions to better model global dependencies.
  • Figure 4: (a) Illustrations of different token shift mechanisms. The Uni-Shift (Peng et al., 2023) fuses the current token with only the last (left) one by linear interpolation. The Quad-Shift (Duan et al., 2024) fuses the current token with four adjacent tokens by linear interpolation. Our proposed Omni-Shift fuses the current token with tokens from all directions by convolution. (b) Illustration of the Omni-Shift with structural re-parameterization.
  • Figure 5: Visual comparison of different methods in PET image synthesis. Zoom in on the rectangular region for a better comparison.
  • ...and 4 more figures
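
The structural re-parameterization mentioned in the Figure 4 caption can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: the branch set (identity, 1x1, 3x3, 5x5), the single-channel setting, and all kernel values are hypothetical. The point is only that several parallel convolution branches, summed together, give the same output as one convolution whose kernel is the sum of the zero-padded branch kernels.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_same(image: Matrix, kernel: Matrix) -> Matrix:
    """Single-channel 2D cross-correlation with zero padding (same output size)."""
    h, w, k = len(image), len(image[0]), len(kernel)
    p = k // 2
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    rr, cc = r + i - p, c + j - p
                    if 0 <= rr < h and 0 <= cc < w:  # outside the image counts as 0
                        acc += kernel[i][j] * image[rr][cc]
            out[r][c] = acc
    return out


def pad_kernel(kernel: Matrix, size: int) -> Matrix:
    """Zero-pad an odd-sized kernel to size x size, keeping it centered."""
    k = len(kernel)
    off = (size - k) // 2
    padded = [[0.0] * size for _ in range(size)]
    for i in range(k):
        for j in range(k):
            padded[off + i][off + j] = kernel[i][j]
    return padded


def merge_branches(kernels: List[Matrix], size: int = 5) -> Matrix:
    """Fold all branch kernels into one size x size kernel by summation."""
    merged = [[0.0] * size for _ in range(size)]
    for kernel in kernels:
        padded = pad_kernel(kernel, size)
        for i in range(size):
            for j in range(size):
                merged[i][j] += padded[i][j]
    return merged


# Hypothetical branches: identity, 1x1, 3x3 and 5x5 kernels with made-up weights.
branches = [
    [[1.0]],
    [[0.5]],
    [[0.25 * (i + j) for j in range(3)] for i in range(3)],
    [[0.125 * (i - j) for j in range(5)] for i in range(5)],
]
image = [[float(r * 4 + c) for c in range(4)] for r in range(4)]

# Training-time view: run every branch and add the results.
multi = [[0.0] * 4 for _ in range(4)]
for kernel in branches:
    branch_out = conv2d_same(image, kernel)
    for r in range(4):
        for c in range(4):
            multi[r][c] += branch_out[r][c]

# Inference-time view: one convolution with the merged 5x5 kernel.
single = conv2d_same(image, merge_branches(branches))
```

Because convolution is linear in its kernel, `multi` and `single` agree up to floating-point rounding, so the extra branches add no cost once they are merged.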