Prompt-Guided Image-Adaptive Neural Implicit Lookup Tables for Interpretable Image Enhancement

Satoshi Kosugi

TL;DR

This work tackles interpretable yet high-quality image enhancement by moving beyond handcrafted or linear LUT-based edits. It introduces IA-NILUT, an Image-Adaptive Neural Implicit Lookup Table in which an MLP encodes a color transform and takes image-adaptive parameters ${\bf w} \in \mathbb{R}^J$ as additional input; LUT bypassing enables real-time application. A prompt guidance loss using CLIP pairs guides each filter toward an intuitive human-readable label while enforcing that each $w_j$ controls only its designated impression. The method is trained in three stages and evaluated on FiveK and PPR10K, where it outperforms predefined-filter baselines and is competitive with uninterpretable methods; its interpretability is attributed to sorted RGB inputs and the prompting framework. Overall, IA-NILUT provides a scalable, interpretable, and efficient mechanism for content-aware image enhancement, with practical implications for on-device editing and user-facing photo tools.
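The interplay between the MLP filter and LUT bypassing can be illustrated with a small NumPy sketch. It is not the authors' implementation: the network is an untrained two-layer MLP, the residual output form is an assumption, and the sizes `J`, `HIDDEN`, `GRID` and the function names `mlp_filter`, `build_lut`, `apply_lut` are hypothetical. The idea shown is that, once ${\bf w}$ is fixed for an image, the MLP only needs to be evaluated at the nodes of a coarse 3D grid, after which every pixel is transformed by trilinear interpolation.

```python
import numpy as np

rng = np.random.default_rng(0)
J, HIDDEN, GRID = 3, 16, 17  # hypothetical: filter count, MLP width, LUT resolution

# Untrained two-layer MLP standing in for a learned filter network.
W1 = rng.normal(0.0, 0.5, size=(3 + J, HIDDEN))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(0.0, 0.5, size=(HIDDEN, 3))
b2 = np.zeros(3)

def mlp_filter(rgb, w):
    """Map colors (N, 3) in [0, 1] to colors (N, 3), conditioned on w of shape (J,)."""
    x = np.concatenate([rgb, np.broadcast_to(w, (rgb.shape[0], J))], axis=1)
    h = np.tanh(x @ W1 + b1)
    return rgb + h @ W2 + b2  # residual form is an assumption of this sketch

def build_lut(w, grid=GRID):
    """LUT bypassing: evaluate the MLP once per grid node for a fixed w."""
    axis = np.linspace(0.0, 1.0, grid)
    r, g, b = np.meshgrid(axis, axis, axis, indexing="ij")
    nodes = np.stack([r, g, b], axis=-1).reshape(-1, 3)
    return mlp_filter(nodes, w).reshape(grid, grid, grid, 3)

def apply_lut(lut, rgb):
    """Trilinear interpolation of a (G, G, G, 3) LUT at colors (N, 3)."""
    g = lut.shape[0]
    pos = np.clip(rgb, 0.0, 1.0) * (g - 1)
    lo = np.minimum(np.floor(pos).astype(int), g - 2)  # lower corner of the cell
    t = pos - lo                                       # fractional offset in the cell
    out = np.zeros(rgb.shape, dtype=float)
    for dr in (0, 1):
        for dg in (0, 1):
            for db in (0, 1):
                wr = t[:, 0] if dr else 1.0 - t[:, 0]
                wg = t[:, 1] if dg else 1.0 - t[:, 1]
                wb = t[:, 2] if db else 1.0 - t[:, 2]
                corner = lut[lo[:, 0] + dr, lo[:, 1] + dg, lo[:, 2] + db]
                out += (wr * wg * wb)[:, None] * corner
    return out

# Compare the direct per-pixel MLP with the bypassed LUT path.
w = np.array([0.5, -0.2, 0.1])
pixels = rng.random((1000, 3))
lut = build_lut(w)
direct = mlp_filter(pixels, w)
fast = apply_lut(lut, pixels)
max_err = float(np.max(np.abs(direct - fast)))
```

Here `max_err` measures the gap between the two paths; a denser grid shrinks it at the cost of more MLP evaluations per image.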

Abstract

In this paper, we delve into the concept of interpretable image enhancement, a technique that enhances image quality by adjusting filter parameters with easily understandable names such as "Exposure" and "Contrast". Unlike using predefined image editing filters, our framework utilizes learnable filters that acquire interpretable names through training. Our contribution is two-fold. Firstly, we introduce a novel filter architecture called an image-adaptive neural implicit lookup table, which uses a multilayer perceptron to implicitly define the transformation from input feature space to output color space. By incorporating image-adaptive parameters directly into the input features, we achieve highly expressive filters. Secondly, we introduce a prompt guidance loss to assign interpretable names to each filter. We evaluate visual impressions of enhancement results, such as exposure and contrast, using a vision and language model along with guiding prompts. We define a constraint to ensure that each filter affects only the targeted visual impression without influencing other attributes, which allows us to obtain the desired filter effects. Experimental results show that our method outperforms existing predefined filter-based methods, thanks to the filters optimized to predict target results. Our source code is available at https://github.com/satoshi-kosugi/PG-IA-NILUT.
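One way to picture the constraint that a filter should change only its targeted impression is the toy sketch below. It is an illustration under stated assumptions, not the loss defined in the paper: embeddings are hand-made vectors rather than outputs of a vision and language model, each impression is scored by contrasting a positive and a negative prompt embedding, and the names `impression_scores` and `guidance_penalty` are hypothetical.

```python
import numpy as np

def normalize(v):
    """Scale vectors to unit length along the last axis."""
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def impression_scores(image_emb, pos_prompts, neg_prompts):
    """Score each impression as cos(image, positive) - cos(image, negative)."""
    image_emb = normalize(image_emb)
    return normalize(pos_prompts) @ image_emb - normalize(neg_prompts) @ image_emb

def guidance_penalty(scores_before, scores_after, j):
    """Reward a rise in impression j and penalise any change in the others."""
    delta = scores_after - scores_before
    off_target = np.delete(np.abs(delta), j).sum()
    return off_target - delta[j]

# Stand-in embeddings (assumption): two impressions in a 4-D space.
e = np.eye(4)
pos_prompts = np.stack([e[0], e[1]])    # e.g. a "bright" and a "high contrast" direction
neg_prompts = np.stack([-e[0], -e[1]])  # their opposites

before = impression_scores(e[3], pos_prompts, neg_prompts)
clean = impression_scores(e[3] + 0.5 * e[0], pos_prompts, neg_prompts)
entangled = impression_scores(e[3] + 0.5 * e[0] + 0.5 * e[1], pos_prompts, neg_prompts)

clean_penalty = guidance_penalty(before, clean, j=0)
entangled_penalty = guidance_penalty(before, entangled, j=0)
```

In this toy setup the edit that moves only the first impression receives a lower penalty than the edit that also shifts the second, which is the behavior the constraint is meant to encourage.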

Paper Structure (22 sections, 12 equations, 15 figures, 6 tables)

Figures (15)

  • Figure 1: Overview of our interpretable image enhancement method. For a highly expressive filter architecture, we propose an IA-NILUT. LUT bypassing speeds up the transformation. Additionally, we introduce a prompt guidance loss to assign interpretable names to each filter. Because our method provides an interpretable and learnable framework for enhancement, it outperforms other predefined filter-based methods.
  • Figure 2: Comparison between the 3D LUTs [zeng2022learning], the IA-NILUT, and the IA-NILUT with LUT bypassing.
  • Figure 2: Comparison of filter architecture using FiveK (480p).
  • Figure 3: Motivation for our prompt guidance loss.
  • Figure 4: Visualization of learned filter effects. Only certain parameters are varied while others are held constant at 0. The images on the left and right are samples from FiveK and PPR10K, respectively.
  • ...and 10 more figures