Goal Conditioned Reinforcement Learning for Photo Finishing Tuning

Jiarui Wu, Yujin Wang, Lingen Li, Zhang Fan, Tianfan Xue

TL;DR

The paper tackles automatic tuning of non-differentiable image processing pipelines to achieve target appearances or styles. It introduces a goal-conditioned reinforcement learning framework that treats the pipeline as a black box and employs a novel state representation (dual-path CNN features, photo statistics, and historical actions) along with two reward formulations (PSNR-based finishing and perceptual style-based stylization) trained via TD3. Results show the method reaches target outcomes with as few as 10 pipeline queries, outperforming zeroth- and first-order baselines, and generalizes to unseen datasets such as HDR+ and new style targets. This approach offers a practical, efficient alternative for photorealistic image tuning with flexible goal conditioning and broad applicability to various visual editing tasks.
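
To make the PSNR-based finishing reward concrete, here is a minimal sketch. It assumes the reward is the PSNR gain toward the goal image after one tuning step; the summary only says the reward is PSNR-based, so the exact form is an assumption. The function names `psnr` and `psnr_gain_reward` are hypothetical and are not taken from the paper's code.

```python
import numpy as np


def psnr(image, goal, max_val=1.0):
    """Peak signal-to-noise ratio (dB) between two same-shaped images."""
    diff = np.asarray(image, dtype=np.float64) - np.asarray(goal, dtype=np.float64)
    # Floor the MSE so identical images give a large finite PSNR instead of inf.
    mse = max(float(np.mean(diff ** 2)), 1e-10)
    return float(10.0 * np.log10(max_val ** 2 / mse))


def psnr_gain_reward(prev_image, curr_image, goal):
    """Assumed reward shape: PSNR improvement toward the goal after one step."""
    return psnr(curr_image, goal) - psnr(prev_image, goal)


goal = np.full((4, 4, 3), 0.5)
prev_image = goal + 0.2  # farther from the goal
curr_image = goal + 0.1  # closer to the goal
reward = psnr_gain_reward(prev_image, curr_image, goal)  # positive: the step helped
```

A difference-based reward like this is positive only when a step moves the output closer to the goal, which is one common way to shape rewards for iterative refinement.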

Abstract

Photo finishing tuning aims to automate the manual tuning process of a photo finishing pipeline, such as Adobe Lightroom or Darktable. Previous works either use zeroth-order optimization, which is slow when the number of parameters increases, or rely on a differentiable proxy of the target finishing pipeline, which is hard to train. To overcome these challenges, we propose a novel goal-conditioned reinforcement learning framework for efficiently tuning parameters using a goal image as a condition. Unlike previous approaches, our tuning framework does not rely on any proxy and treats the photo finishing pipeline as a black box. Utilizing a trained reinforcement learning policy, it can efficiently find the desired set of parameters within just 10 queries, while optimization-based approaches normally take 200 queries. Furthermore, our architecture utilizes a goal image to guide the iterative tuning of pipeline parameters, allowing for flexible conditioning on pixel-aligned target images, style images, or any other visually representable goals. We conduct detailed experiments on photo finishing tuning and photo stylization tuning tasks, demonstrating the advantages of our method. Project website: https://openimaginglab.github.io/RLPixTuner/.
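
The iterative, goal-conditioned tuning described in the abstract can be sketched as a query loop around a black-box pipeline. Everything below is illustrative. `toy_pipeline` is a one-parameter exposure gain standing in for a real finishing pipeline. `toy_policy` is a hand-written heuristic standing in for the trained policy, not the paper's network. `tune` is a hypothetical helper name. Only the budget of 10 queries comes from the abstract.

```python
import numpy as np


def toy_pipeline(raw, params):
    """Stand-in black box: exposure gain plus clipping. It is only queried."""
    return np.clip(raw * params[0], 0.0, 1.0)


def toy_policy(current, goal, params, history):
    """Hypothetical placeholder for a trained actor.

    Maps (current image, goal image, past actions) to new parameters.
    Here it simply rescales the gain by the ratio of mean intensities.
    """
    ratio = (goal.mean() + 1e-8) / (current.mean() + 1e-8)
    return np.array([params[0] * ratio])


def tune(raw, goal, pipeline, policy, init_params, max_queries=10):
    """Run goal-conditioned tuning with a fixed budget of pipeline queries."""
    params = np.asarray(init_params, dtype=np.float64)
    history = []                      # past actions, part of the state
    current = pipeline(raw, params)   # first query
    queries = 1
    while queries < max_queries:
        params = policy(current, goal, params, history)
        history.append(params.copy())
        current = pipeline(raw, params)
        queries += 1
    return params, current, queries


rng = np.random.default_rng(0)
raw = rng.uniform(0.0, 0.4, size=(8, 8, 3))
goal = toy_pipeline(raw, np.array([1.8]))  # goal rendered with a hidden gain
params, result, queries = tune(raw, goal, toy_pipeline, toy_policy, [1.0])
```

The loop never inspects the pipeline's internals or gradients. It only renders an image, compares it with the goal through the policy, and proposes new parameters, which is what lets the same structure wrap a non-differentiable tool.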

Paper Structure

This paper contains 21 sections, 9 equations, 28 figures, 4 tables.

Figures (28)

  • Figure 1: In this work, we propose an RL-based photo finishing tuning algorithm that efficiently tunes the parameters of a black-box image processing pipeline to match any tuning target. The RL-based solution (top row) takes only about 10 iterations to achieve a similar PSNR as the 500-iteration output of a zeroth-order algorithm (bottom row). Our method demonstrates fast convergence, high quality, and no need for a proxy.
  • Figure 2: The overall framework. Top row: at each step, our policy maps the current image and the goal image to an action (new parameters), with the help of our state representation consisting of dual-path features, photo statistics, and historical actions. Bottom row: visualization of the iterative tuning trajectory of our RL-based photo finishing framework.
  • Figure 3: Photo finishing tuning results on the FiveK dataset with the expert C target. The visual results of our method are closest to the target image, especially in terms of color and brightness.
  • Figure 4: Photo stylization tuning results. Compared with CMA-ES [hansen2006cma, mosleh2020hardware], monolithic proxy [tseng2019hyperparameter], and cascaded proxy [tseng2022neural], our output best matches the style goal.
  • Figure 5: Qualitative comparison on the HDR+ photo finishing tuning task. These comparisons illustrate that our method remains closer to the target even when dealing with input and target images outside the training distribution.
  • ...and 23 more figures