Attentive Illumination Decomposition Model for Multi-Illuminant White Balancing

Dongyoung Kim, Jinwoo Kim, Junsang Yu, Seon Joo Kim

TL;DR

The paper tackles white balance in scenes with multiple, spatially varying illuminants by introducing Attentive Illumination Decomposition (AID), a framework that uses slot attention to learn a separate chromaticity and pixel-wise weight map for each illuminant, which are then fused under a physically grounded Lambertian model. Because AID decomposes illumination into individual components, it supports illuminant-aware editing and reports per-illuminant information such as color and count, while the learned representations are combined so that the linear mixing constraint holds. The authors propose a centroid-based matching loss to encourage slot specialization and prevent indiscriminate slot activation. On LSMI, MIIW, and NUS-8, AID achieves state-of-the-art performance for both single- and multi-illuminant WB, generalizes robustly, and adds capabilities for controllable WB and illumination editing, marking a step toward interpretable, physics-informed image enhancement.
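
The fusion step described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes K illuminant chromaticities given as RGB triples and K weight maps that are non-negative and sum to 1 at every pixel. The function names `fuse_illumination` and `white_balance` are hypothetical, and plain Python lists are used only to keep the sketch dependency-free.

```python
from typing import List, Sequence

Color = Sequence[float]                # (R, G, B) chromaticity of one illuminant
WeightMap = List[List[float]]          # H x W per-pixel mixing weights
Image = List[List[List[float]]]        # H x W x 3 nested lists


def fuse_illumination(chromas: Sequence[Color], weights: Sequence[WeightMap]) -> Image:
    """Linearly mix K chromaticities with K weight maps into one illumination map.

    Assumption: weights[k][y][x] >= 0 and the weights sum to 1 over k at each pixel.
    """
    height, width = len(weights[0]), len(weights[0][0])
    fused = [[[0.0, 0.0, 0.0] for _ in range(width)] for _ in range(height)]
    for chroma, weight in zip(chromas, weights):
        for y in range(height):
            for x in range(width):
                for c in range(3):
                    fused[y][x][c] += weight[y][x] * chroma[c]
    return fused


def white_balance(image: Image, illum: Image, eps: float = 1e-8) -> Image:
    """Divide every pixel by its local illuminant color, channel by channel."""
    return [
        [[pix[c] / (ill[c] + eps) for c in range(3)] for pix, ill in zip(img_row, ill_row)]
        for img_row, ill_row in zip(image, illum)
    ]


# Toy 1 x 2 scene: a warm and a cool light (hypothetical values).
chromas = [(2.0, 1.0, 0.5), (0.5, 1.0, 2.0)]
weights = [[[1.0, 0.5]], [[0.0, 0.5]]]   # left pixel: warm only; right pixel: 50/50 mix
illum = fuse_illumination(chromas, weights)
```

Here the left pixel receives only the warm light and the right pixel an even mix, so a gray surface rendered under `illum` returns to gray after `white_balance`.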

Abstract

White balance (WB) algorithms in many commercial cameras assume single, uniform illumination, leading to undesirable results when multiple light sources with different chromaticities exist in the scene. Prior research on multi-illuminant WB typically predicts illumination at the pixel level without fully grasping the scene's actual lighting conditions, including the number and color of light sources. This often results in unnatural outcomes that lack overall consistency. To handle this problem, we present a deep white balancing model that leverages slot attention, where each slot is in charge of representing an individual illuminant. This design enables the model to generate chromaticities and weight maps for individual illuminants, which are then fused to compose the final illumination map. Furthermore, we propose the centroid-matching loss, which regulates the activation of each slot based on the color range, thereby helping the model separate illuminants more effectively. Our method achieves state-of-the-art performance on both single- and multi-illuminant WB benchmarks, and also offers additional information such as the number of illuminants in the scene and their chromaticities. This capability allows for illumination editing, an application not feasible with prior methods.
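
To make the illumination-editing claim concrete, the sketch below relights a single pixel once per-illuminant chromaticities and weights are available. It is an illustration under stated assumptions rather than the paper's procedure: the pixel is assumed to follow a linear mixing model, and the helper names `mix` and `relight_pixel`, along with all numeric values, are hypothetical.

```python
from typing import Sequence, Tuple

RGB = Tuple[float, ...]  # three channel values (R, G, B)


def mix(weights: Sequence[float], chromas: Sequence[RGB]) -> RGB:
    """Weighted sum of illuminant chromaticities at one pixel."""
    return tuple(sum(w * ch[c] for w, ch in zip(weights, chromas)) for c in range(3))


def relight_pixel(
    pixel: RGB,
    weights: Sequence[float],
    old_chromas: Sequence[RGB],
    new_chromas: Sequence[RGB],
    eps: float = 1e-8,
) -> RGB:
    """Replace the estimated illuminant colors with new ones at one pixel.

    The old mixed illuminant is divided out and the new mix is multiplied in.
    """
    old_mix = mix(weights, old_chromas)
    new_mix = mix(weights, new_chromas)
    return tuple(pixel[c] * new_mix[c] / (old_mix[c] + eps) for c in range(3))


# Hypothetical pixel lit equally by a warm and a cool light.
warm, cool, neutral = (2.0, 1.0, 0.5), (0.5, 1.0, 2.0), (1.0, 1.0, 1.0)
pixel = (0.5, 0.4, 0.5)        # gray surface (0.4) under the 50/50 mix
weights = (0.5, 0.5)

only_warm_removed = relight_pixel(pixel, weights, [warm, cool], [neutral, cool])
fully_balanced = relight_pixel(pixel, weights, [warm, cool], [neutral, neutral])
```

Setting one illuminant to neutral corrects only that light, while setting all of them to neutral corresponds to full white balance. Any other target color could be substituted to change a light's chromaticity.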

Paper Structure (26 sections, 10 equations, 5 figures, 7 tables)

This paper contains 26 sections, 10 equations, 5 figures, 7 tables.

Figures (5)

  • Figure 1: Comparison of the AID framework (bottom) with existing approaches (top). Previous methodologies did not individually consider illuminant profiles within the scene, resulting in unnatural results. The AID framework outperforms previous works in illumination estimation by estimating the chromaticity and pixel-wise weight map of each individual illuminant and combining them.
  • Figure 2: (a) Overview of our framework. Image features are extracted from the input using a U-Net encoder. Next, slot attention adaptively calibrates the slot representations so that they bind to the illuminant chromaticities in each scene. Finally, the model fuses the chromaticities and weight maps to generate the mixed illumination map. (b) Detailed generation flow of the weight maps and calibrated slots, where Q-Softmax denotes softmax applied along the query dimension. (c) Illustration of the slot-wise loss using centroid-based Hungarian matching under the assumption $K=4, N=2$.
  • Figure 3: Slot calibration process. The chromaticity $\boldsymbol{\hat{\ell}}_k$ and weight map $\hat{\alpha}_k$ generated from each $slots_n$ are iteratively calibrated to their ground truth values.
  • Figure 4: Qualitative comparison on the LSMI test set. The top three rows show the original raw images and the corresponding WB results. The last two rows show the sRGB input images and the corresponding illumination maps. The two rightmost columns demonstrate that our model, which infers illuminant-wise chromaticity and mixes it spatially, produces more stable illumination plots than previous approaches. The x- and y-axes of each plot show the R/G and B/G ratios of the illumination values, respectively.
  • Figure 5: Further applications of the AID framework on LSMI test set examples. The separated weight maps and the corresponding illuminant chromaticities (Decomp) allow white balance to be applied to each light individually (WB Illum1,2) and the chromaticity to be adjusted as desired (Illum manip). Full WB shows the result of applying white balance to all illuminants, for reference. Gamma was adjusted for all images to increase visibility, and the G channel was scaled down for the decomposed illumination map visualization.
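
The centroid-based matching in Figure 2(c) can be illustrated with a small assignment sketch for the $K=4, N=2$ case. The paper's exact cost function and color space are not given in this summary, so the code assumes Euclidean distance between (R/G, B/G) chromaticities and per-slot centroids. An exhaustive search over assignments stands in for the Hungarian algorithm, which reaches the same optimum and is only needed for larger problems. The function name `match_illuminants_to_slots` and all numeric values are hypothetical.

```python
from itertools import permutations
from math import dist
from typing import List, Sequence, Tuple

Chroma = Tuple[float, float]  # assumed (R/G, B/G) chromaticity


def match_illuminants_to_slots(
    centroids: Sequence[Chroma], gt_chromas: Sequence[Chroma]
) -> List[int]:
    """Return, for each ground-truth illuminant, the index of its matched slot.

    Picks the one-to-one assignment with the smallest total distance between
    each ground-truth chromaticity and the centroid of its assigned slot.
    """
    k, n = len(centroids), len(gt_chromas)
    if n > k:
        raise ValueError("more illuminants than slots")
    best = min(
        permutations(range(k), n),  # every ordered choice of n distinct slots
        key=lambda slots: sum(
            dist(gt_chromas[i], centroids[s]) for i, s in enumerate(slots)
        ),
    )
    return list(best)


# K = 4 slot centroids and N = 2 ground-truth illuminants (hypothetical values).
centroids = [(0.4, 1.6), (0.7, 1.1), (1.0, 0.8), (1.5, 0.5)]
gt_chromas = [(1.4, 0.55), (0.45, 1.5)]

assignment = match_illuminants_to_slots(centroids, gt_chromas)
inactive_slots = [s for s in range(len(centroids)) if s not in assignment]
```

Slots left in `inactive_slots` are the ones a slot-wise loss would presumably push toward zero activation, which is how such a matching could discourage indiscriminate slot activation.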