InstantRetouch: Personalized Image Retouching without Test-time Fine-tuning Using an Asymmetric Auto-Encoder

Temesgen Muruts Weldengus, Binnan Liu, Fei Kou, Youwei Lyu, Jinwei Chen, Qingnan Fan, Changqing Zou

TL;DR

InstantRetouch addresses the challenge of personalized image retouching without test-time fine-tuning by learning an asymmetric auto-encoder that encodes user retouching style into a content-disentangled latent, enabling faithful style transfer to new images. The Retrieval-Augmented Retouching (RAR) module further personalizes outputs by content-aware aggregation of style latents from the most similar reference pairs. Empirical results across VCIRB, PPR10K-Groups, and MIT-FiveK/PPR10K show superior performance over existing methods, and the framework naturally supports photorealistic style transfer without paired training. A visually consistent Lightroom-based dataset and a principled reference-sampling strategy underpin robust meta-learning, highlighting the approach’s practicality for real-world, few-shot personalization.

Abstract

Personalized image retouching aims to adapt to the retouching style of individual users from reference examples, but existing methods often require user-specific fine-tuning or fail to generalize effectively. To address these challenges, we introduce $\textbf{InstantRetouch}$, a general framework for personalized image retouching that instantly adapts to user retouching styles without any test-time fine-tuning. It employs an $\textit{asymmetric auto-encoder}$ to encode the retouching style from paired examples into a content-disentangled latent representation that enables faithful transfer of the retouching style to new images. To adaptively apply the encoded retouching style to new images, we further propose $\textit{retrieval-augmented retouching}$ (RAR), which retrieves and aggregates style latents from the reference pairs most similar in content to the query image. With these components, $\textbf{InstantRetouch}$ enables superior and generic content-aware retouching personalization across diverse scenarios, including single-reference, multi-reference, and mixed-style setups, while also generalizing out of the box to photorealistic style transfer.
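
The retrieval-and-aggregation idea behind RAR can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, the number of retrieved references, or the aggregation rule, so everything below is an assumption made for illustration: cosine similarity over content features, top-`k` selection, and softmax-weighted averaging. The names `aggregate_style`, `query_feat`, `ref_feats`, `ref_latents`, `k`, and `temperature` are hypothetical and do not come from the paper.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + 1e-12)


def aggregate_style(query_feat, ref_feats, ref_latents, k=2, temperature=0.1):
    """Blend the style latents of the k references closest in content to the query.

    Assumed scheme (not taken from the paper): cosine similarity for
    retrieval, softmax over the retrieved similarities for the weights.
    """
    sims = [cosine(query_feat, feat) for feat in ref_feats]
    # Indices of the k most similar references, best first.
    top = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)[:k]
    # Numerically stable softmax over the retrieved similarities.
    peak = max(sims[i] for i in top)
    weights = [math.exp((sims[i] - peak) / temperature) for i in top]
    total = sum(weights)
    dim = len(ref_latents[0])
    return [
        sum(w * ref_latents[i][d] for w, i in zip(weights, top)) / total
        for d in range(dim)
    ]
```

With `k=1` this reduces to nearest-reference style selection; larger `k` or a higher `temperature` blends several reference styles, which is one plausible way to handle the multi-reference and mixed-style setups mentioned above.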

Paper Structure

This paper contains 16 sections, 4 equations, 11 figures, 6 tables, 1 algorithm.

Figures (11)

  • Figure 1: Our proposed method effectively personalizes image retouching from single or multiple references of similar or diverse styles at inference time, with no test-time fine-tuning. It also supports photorealistic style transfer out of the box.
  • Figure 2: Image retouching style transfer from a single reference pair. Top: input–output reference pair defining the retouching style. Second row: query input images. Third row: our method’s results. Bottom: ground-truth retouched images.
  • Figure 3: Illustration of our Asymmetric Auto-encoder. The asymmetric auto-encoder is the core component of our approach, comprising a LoRA-adapted Siamese encoder and a lightweight conditional MLP decoder. The encoder processes input-output pairs to learn a content-disentangled retouching style representation, encoding it into a compact latent vector. The conditional MLP decoder then applies this latent representation to reconstruct the retouched image from the original input, operating in color space to preserve content while applying the desired retouching style.
  • Figure 4: PPR10K groups performance comparison when three references are used.
  • Figure 5: Photorealistic style transfer comparison. Our method can be used for photorealistic style transfer, applying color and tone from a single unpaired reference out of the box.
  • ...and 6 more figures
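
The Figure 3 caption describes a lightweight conditional MLP decoder that operates in color space, so the output at each pixel depends only on that pixel's color and the style latent. A rough sketch of that structure follows. The layer sizes, the ReLU activation, the residual formulation, and the random untrained weights are all assumptions for illustration, and `make_decoder`, `decode_pixel`, and `decode_image` are hypothetical names rather than the authors' implementation.

```python
import random


def make_decoder(latent_dim, hidden=8, seed=0):
    """Create untrained weights for a two-layer per-pixel MLP (illustrative only)."""
    rng = random.Random(seed)
    in_dim = 3 + latent_dim  # RGB color concatenated with the style latent
    w1 = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(hidden)]
    w2 = [[rng.uniform(-0.1, 0.1) for _ in range(hidden)] for _ in range(3)]
    return w1, w2


def decode_pixel(rgb, latent, params):
    """Map one RGB color to a retouched color, conditioned on the style latent."""
    w1, w2 = params
    x = list(rgb) + list(latent)
    # Hidden layer with ReLU activation.
    h = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in w1]
    # Assumed residual form: predict a color offset and add it to the input.
    delta = [sum(w * v for w, v in zip(row, h)) for row in w2]
    return [min(1.0, max(0.0, c + d)) for c, d in zip(rgb, delta)]


def decode_image(image, latent, params):
    """Apply the same color mapping independently at every pixel."""
    return [[decode_pixel(px, latent, params) for px in row] for row in image]
```

Because the mapping never looks at neighboring pixels, spatial content is preserved by construction, and identical input colors always receive identical output colors for a given latent.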