Distilling Semantic Priors from SAM to Efficient Image Restoration Models

Quan Zhang, Xiaoyu Liu, Wei Li, Hanting Chen, Junchao Liu, Jie Hu, Zhiwei Xiong, Chun Yuan, Yunhe Wang

TL;DR

The paper tackles the high inference cost of leveraging the Segment Anything Model (SAM) for image restoration by introducing a training-time distillation framework. It couples Semantic Priors Fusion (SPF) with Semantic Priors Distillation (SPD), guided by a Semantic-Guided Relation (SGR) module, to transfer SAM-derived semantic priors into existing IR models without altering their inference path. The approach uses cascaded IR networks during training, with SAM and a VGG-based semantic analyzer frozen, and distills the fused priors back into the first IR model via losses $\mathcal{L}_{SPD}$ and $\mathcal{L}_{SGR}$, optimizing $\mathcal{L} = \mathcal{L}_{recon}^1 + \lambda_1 \mathcal{L}_{SPD} + \lambda_2 \mathcal{L}_{SGR}$. Experiments across deraining, deblurring, and denoising datasets show consistent improvements in PSNR, SSIM, and downstream segmentation metrics, validating the framework’s generality and its advantage in maintaining inference efficiency while exploiting SAM’s rich semantic priors.
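The combined objective above is a weighted sum of three scalar terms. A minimal sketch of that sum follows; the function name `total_loss`, the default weights, and the numeric values are illustrative assumptions, not values taken from the paper.

```python
def total_loss(recon_1: float, spd: float, sgr: float,
               lambda_1: float = 1.0, lambda_2: float = 1.0) -> float:
    """Return L = L_recon^1 + lambda_1 * L_SPD + lambda_2 * L_SGR.

    The default weights of 1.0 are placeholders (assumption); the summary
    above does not state the values of lambda_1 and lambda_2.
    """
    return recon_1 + lambda_1 * spd + lambda_2 * sgr


# Hypothetical per-batch loss values, chosen only to make the arithmetic easy to follow.
example = total_loss(recon_1=0.5, spd=0.25, sgr=0.125, lambda_1=2.0, lambda_2=4.0)
print(example)  # 0.5 + 2.0 * 0.25 + 4.0 * 0.125
```

Setting `lambda_1 = lambda_2 = 0` reduces the objective to the plain reconstruction loss of the first IR model, which is one way to read the two extra terms as training-time regularizers.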

Abstract

In image restoration (IR), leveraging semantic priors from segmentation models is a common way to improve performance. The recent Segment Anything Model (SAM) has emerged as a powerful tool for extracting advanced semantic priors to enhance IR tasks. However, SAM's computational cost is prohibitive compared with existing, much smaller IR models, so using SAM to extract semantic priors at inference time considerably hampers efficiency. To address this issue, we propose a general framework that distills SAM's semantic knowledge to boost existing IR models without interfering with their inference process. Specifically, the framework consists of a semantic priors fusion (SPF) scheme and a semantic priors distillation (SPD) scheme. SPF fuses the restored image predicted by the original IR model with the semantic mask predicted by SAM to produce a refined restored image. SPD uses self-distillation to transfer the fused semantic priors back to the original IR model and boost its performance. Additionally, we design a semantic-guided relation (SGR) module for SPD, which enforces consistency of the semantic feature representation space so that the priors are fully distilled. We demonstrate the effectiveness of our framework across multiple IR models and tasks, including deraining, deblurring, and denoising.
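The split between training and inference described above can be sketched as a toy data flow. Everything here is a hypothetical stand-in: images are plain lists of floats, `toy_ir`, `toy_sam`, and `toy_fuse` are placeholder callables, the L1 distance is an assumed placeholder for the distillation term, and feeding the restored image to SAM is an assumption about the wiring. The only point it illustrates is that SAM is called during training and never at inference.

```python
from typing import Callable, Dict, List

Image = List[float]  # toy stand-in for an image tensor (assumption)


def l1(a: Image, b: Image) -> float:
    """Mean absolute difference, used here as a placeholder distance."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def training_step(degraded: Image, target: Image,
                  ir_first: Callable[[Image], Image],
                  ir_second: Callable[[Image], Image],
                  sam: Callable[[Image], Image],
                  fuse: Callable[[Image, Image], Image]) -> Dict[str, float]:
    restored = ir_first(degraded)               # path that is kept at inference
    mask = sam(restored)                        # frozen SAM, training only (assumed input)
    refined = ir_second(fuse(restored, mask))   # SPF-style fusion, then second IR pass
    # Placeholder losses: the real SPD/SGR terms are not specified in this summary.
    return {"recon_1": l1(restored, target), "spd": l1(restored, refined)}


def inference(degraded: Image, ir_first: Callable[[Image], Image]) -> Image:
    """Inference uses only the first IR model; SAM is never invoked."""
    return ir_first(degraded)


sam_calls = {"n": 0}  # counts how often the toy SAM is invoked


def toy_sam(img: Image) -> Image:
    sam_calls["n"] += 1
    return [1.0 if v > 0.5 else 0.0 for v in img]


def toy_ir(img: Image) -> Image:
    return [0.5 * v for v in img]


def toy_fuse(img: Image, mask: Image) -> Image:
    return [v + 0.25 * m for v, m in zip(img, mask)]


losses = training_step([2.0, 0.0], [1.0, 0.0], toy_ir, toy_ir, toy_sam, toy_fuse)
output = inference([2.0, 0.0], toy_ir)
print(losses, output, sam_calls["n"])
```

After both calls the counter shows a single SAM invocation, made inside `training_step`; `inference` has the same cost as running the original IR model alone.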

Paper Structure (16 sections, 8 equations, 11 figures, 7 tables)

Figures (11)

  • Figure 1: Comparison of training and inference pipelines for different ways of exploiting semantic priors from SAM. (a) Existing methods require SAM at both the training and inference stages. (b) Our method uses SAM only at the training stage and keeps the inference efficiency of the original image restoration model.
  • Figure 2: The workflow of our proposed framework to distill semantic knowledge from SAM to boost existing IR models without interfering with their inference process.
  • Figure 3: Architecture of the Semantic Priors Fusion (SPF) unit.
  • Figure 4: The qualitative comparison of IR models with and without our framework on various deraining datasets.
  • Figure 5: The qualitative comparison of IR models with and without our framework on the Cityscapes datasets.
  • ...and 6 more figures