Contribution-based Low-Rank Adaptation with Pre-training Model for Real Image Restoration

Donwon Park, Hayeon Kim, Se Young Chun

TL;DR

This work addresses the high cost of adapting pre-trained models to multiple real-world image degradation tasks by introducing PROD, a pre-training strategy that uses random order degradations to expand degradation representations, and CoLoRA, a contribution-based low-rank adapter that tunes only a small, layer-aware subset of parameters. FAIG is used to quantify layer contributions, guiding a per-layer rank adjustment δ_i that allocates about 7% of parameters to task-specific adapters, while leaving the base model frozen. Across six real IR tasks and two backbone architectures (CNN-based NAFNet and transformer-based Restormer), PROD + CoLoRA achieves competitive or superior PSNR compared to full fine-tuning, with substantially lower memory and storage requirements, demonstrating strong practical potential for on-device IR. The method’s scalability and robustness are supported by ablations on α/β scaling and by analyses showing PROD improves degradation representations even without fine-tuning, underscoring its applicability to diverse real-world degradations.
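The idea of pre-training on random order degradations can be illustrated with a minimal sketch. The three toy degradations below (`add_noise`, `box_blur`, `quantize`), their parameters, and the choice to apply each one exactly once per sample are assumptions made for illustration; the summary does not specify the degradation set or sampling scheme that PROD uses.

```python
import random

def add_noise(img, rng, sigma=0.05):
    """Additive Gaussian noise, clipped back to [0, 1]."""
    return [[min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in row]
            for row in img]

def box_blur(img, rng):
    """3-tap horizontal box blur with edge replication (rng is unused)."""
    out = []
    for row in img:
        n = len(row)
        out.append([(row[max(i - 1, 0)] + row[i] + row[min(i + 1, n - 1)]) / 3.0
                    for i in range(n)])
    return out

def quantize(img, rng, levels=8):
    """Coarse quantization, a crude stand-in for compression artifacts."""
    return [[round(p * (levels - 1)) / (levels - 1) for p in row] for row in img]

# Hypothetical degradation pool; a real pipeline would use richer operators.
DEGRADATIONS = [add_noise, box_blur, quantize]

def random_order_degrade(clean, rng):
    """Apply every degradation once, in a freshly shuffled order.

    Returns the degraded image and the order that was used, so that
    (degraded, clean) can serve as a synthetic pre-training pair.
    """
    order = DEGRADATIONS[:]
    rng.shuffle(order)
    degraded = clean
    for fn in order:
        degraded = fn(degraded, rng)
    return degraded, [fn.__name__ for fn in order]

if __name__ == "__main__":
    rng = random.Random(0)
    clean = [[0.0, 0.5, 1.0, 0.25], [0.1, 0.2, 0.3, 0.4]]
    low_quality, order = random_order_degrade(clean, rng)
    print(order)
    print(low_quality)
```

Because the order is reshuffled for every sample, the same clean image yields differently composed degradations across iterations, which is the property the TL;DR credits with expanding the degradation representations.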

Abstract

Recently, pre-trained models and efficient parameter tuning have achieved remarkable success in natural language processing and high-level computer vision with the aid of masked modeling and prompt tuning. In low-level computer vision, however, there have been limited investigations of pre-trained models, and efficient fine-tuning strategies have not yet been explored despite their importance and benefit in various real-world tasks, such as alleviating the memory inflation issue when integrating new tasks on AI edge devices. Here, we propose a novel efficient parameter tuning approach dubbed contribution-based low-rank adaptation (CoLoRA) for multiple image restoration tasks, along with an effective pre-training method with random order degradations (PROD). Unlike prior work that tunes all network parameters, our CoLoRA fine-tunes a small number of parameters by leveraging LoRA (low-rank adaptation) for each new vision task, using our contribution-based method to adaptively determine the layer-by-layer capacity for that task and yield performance comparable to full tuning. Furthermore, our PROD strategy extends the capability of pre-trained models, improving both performance and robustness, to bridge synthetic pre-training and real-world fine-tuning. Our CoLoRA with PROD has demonstrated superior performance in various image restoration tasks across diverse degradation types on both synthetic and real-world datasets for known and novel tasks.
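The low-rank adaptation that CoLoRA builds on can be sketched for a single linear layer as follows. This is the standard LoRA formulation (frozen weight plus a scaled product of two small matrices), written in plain Python for readability; the class name `LoRALinear`, the `alpha / rank` scaling, and the initialization are assumptions for illustration and are not taken from the paper's implementation.

```python
import random

def matvec(matrix, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

class LoRALinear:
    """Frozen weight W plus a trainable low-rank update scale * B @ A."""

    def __init__(self, weight, rank, alpha, rng):
        self.weight = weight                      # frozen, shape out x in
        out_dim, in_dim = len(weight), len(weight[0])
        self.scale = alpha / rank                 # standard LoRA scaling (assumed)
        # A starts small and random, B starts at zero, so the adapter is a
        # no-op before any training step.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(in_dim)]
                  for _ in range(rank)]           # shape rank x in
        self.B = [[0.0] * rank for _ in range(out_dim)]  # shape out x rank

    def forward(self, x):
        base = matvec(self.weight, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]

    def trainable_params(self):
        """Only A and B are tuned; W stays frozen."""
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])

if __name__ == "__main__":
    rng = random.Random(0)
    W = [[1.0, 0.0, 0.0, 0.0],
         [0.0, 2.0, 0.0, 0.0],
         [0.0, 0.0, 3.0, 0.0],
         [0.0, 0.0, 0.0, 4.0]]
    layer = LoRALinear(W, rank=1, alpha=1.0, rng=rng)
    print(layer.forward([1.0, 1.0, 1.0, 1.0]))   # equals W @ x while B is zero
    print(layer.trainable_params(), "trainable vs", 16, "frozen")
```

Storing only `A` and `B` per task is what keeps the per-task footprint small: the frozen weight is shared across tasks, and a new task adds parameters proportional to the chosen rank rather than to the full layer size.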

Paper Structure (27 sections, 3 equations, 15 figures, 9 tables)

Figures (15)

  • Figure 1: Illustrations of tuning strategies for novel image restoration tasks. (a) Existing strategies [chen2021pre, li2021efficient, liu2023degae] fully fine-tune a pre-trained model for a new task. (b) Our proposed CoLoRA method enables parameter-efficient fine-tuning by freezing the pre-trained model and adjusting only the additional adapter for novel image restoration tasks.
  • Figure 2: Overview of our proposed CoLoRA with PROD. (a) Our PROD leverages high-quality clean images and synthetically degraded low-quality images for pre-training the model. (b) Our proposed contribution-based efficient LoRA (CoLoRA) for new IR tasks. CoLoRA is configured with a different ratio of learnable network parameters ($\delta$) for each layer based on quantified contributions (Sec. \ref{sec:observation}), enabling efficient fine-tuning for new tasks. (c) CoLoRA can be adjusted according to contribution.
  • Figure 3: (a) For each layer, we measured the FAIG score between a pre-trained model and a model fine-tuned for the specific task to observe the layer's contribution. (b) Experimental results according to fine-tuning location for a blur task. The encoder and decoder occupy 18% of the total network parameters, while the middle layers account for 80%. The encoder and decoder have higher FAIG scores than the middle layers. Bias and normalization layers have higher FAIG values than the weight layers (Conv).
  • Figure 4: Performance comparison based on the scale of training data for 6 IR tasks. In the line graphs, the results of the 6 IR tasks are averaged for comparison. The x-axis represents the number of training samples, and the y-axis is the average PSNR. In the radar graphs, we compare the results of the 6 IR tasks using normalized PSNR at a training data size of 128. (a) and (b) present experimental results for the pre-training and fine-tuning methods, respectively. (c) and (d) present experimental results for our CoLoRA with PROD on NAFNet and Restormer. Our proposed CoLoRA (7%) tunes far fewer network parameters than full fine-tuning (100%) of NAFNet.
  • Figure 5: Qualitative results on the 6 IR tasks for our proposed method, random initialization + full tuning, and DegAE + full tuning. Our method, with both partial and full tuning, yielded visually excellent results on the real IR tasks, outperforming the others.
  • ...and 10 more figures
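
The captions of Figures 2 and 3 describe giving each layer a different adapter capacity according to its measured contribution. A minimal sketch of one way to turn per-layer contribution scores into LoRA ranks is shown below. The proportional rule, the function name `allocate_ranks`, and the example scores are all hypothetical; the scores are placeholder inputs that only mirror the qualitative ordering stated in Figure 3 (encoder and decoder above the middle layers), and the paper's actual mapping from FAIG scores to $\delta$ is not given in this summary.

```python
def allocate_ranks(scores, max_rank, min_rank=1):
    """Map per-layer contribution scores to integer LoRA ranks.

    The highest-scoring layer receives max_rank; other layers receive a
    proportionally smaller rank, never below min_rank.
    """
    top = max(scores.values())
    if top <= 0:
        raise ValueError("at least one contribution score must be positive")
    return {name: max(min_rank, round(max_rank * score / top))
            for name, score in scores.items()}

if __name__ == "__main__":
    # Placeholder scores for illustration only, not measured values.
    example_scores = {"encoder": 0.9, "middle": 0.2, "decoder": 0.8}
    print(allocate_ranks(example_scores, max_rank=8))
```

A proportional rule like this concentrates adapter capacity where the contribution measure is largest, which matches the intent described in the captions even if the exact allocation used by the authors differs.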