Optimizing Illuminant Estimation in Dual-Exposure HDR Imaging

Mahmoud Afifi, Zhenhua Hu, Liang Liang

TL;DR

This paper tackles illuminant estimation in dual-exposure HDR imaging by introducing a compact dual-exposure feature (DEF) computed from long ($I_l$) and short ($I_s$) exposure frames. DEF guides two lightweight estimators: an exposure-based MLP (EMLP) that takes DEF as input, and an exposure-based CCC (ECCC) that dynamically biases a two-histogram CCC using DEF-driven interpolation of learnable biases. On a newly collected multi-exposure HDR dataset, DEF-enabled models achieve competitive accuracy with far fewer parameters (EMLP ~354 params; ECCC ~6,156 params) and show that ensemble predictions can surpass state-of-the-art methods while remaining computationally efficient. The work demonstrates practical applicability for on-device white balance in camera ISPs and suggests further exploration of cross-camera stability and spatially varying DEF variants for real-world HDR pipelines. $I_l$ and $I_s$ are used to derive the DEF, which in turn informs the illuminant estimate $\boldsymbol{\ell}$ through the proposed models.

Abstract

High dynamic range (HDR) imaging involves capturing a series of frames of the same scene, each with different exposure settings, to broaden the dynamic range of light. This can be achieved through burst capturing or using staggered HDR sensors that capture long and short exposures simultaneously in the camera image signal processor (ISP). Within the camera ISP pipeline, illuminant estimation is a crucial step that aims to estimate the color of the global illuminant in the scene. This estimate is used in the camera ISP white-balance module to remove undesirable color casts from the final image. Despite the multiple frames captured in the HDR pipeline, conventional illuminant estimation methods often rely on only a single frame of the scene. In this paper, we explore leveraging information from frames captured with different exposure times. Specifically, we introduce a simple feature extracted from dual-exposure images to guide illuminant estimators, referred to as the dual-exposure feature (DEF). To validate the efficiency of DEF, we employed two illuminant estimators using the proposed DEF: 1) a multilayer perceptron network (MLP), referred to as exposure-based MLP (EMLP), and 2) a modified version of convolutional color constancy (CCC) that integrates our DEF, which we call ECCC. Both EMLP and ECCC achieve promising results, in some cases surpassing prior methods that require hundreds of thousands or millions of parameters, with only a few hundred parameters for EMLP and a few thousand parameters for ECCC.
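To make the role of the illuminant estimate concrete, the following is a minimal sketch of how a white-balance module might apply it, assuming a simple diagonal (von Kries-style) correction on linear raw RGB values. The function name `white_balance`, the green-channel normalization, and the sample values are illustrative assumptions, not details taken from the paper.

```python
def white_balance(pixels, illuminant):
    """Apply per-channel gains derived from an estimated illuminant.

    Assumption: a diagonal correction normalized so that the green
    channel is left unchanged. This is a common convention, not
    necessarily the one used in the paper's pipeline.
    """
    r, g, b = illuminant
    if min(r, g, b) <= 0:
        raise ValueError("illuminant components must be positive")
    gains = (g / r, 1.0, g / b)
    return [tuple(c * k for c, k in zip(px, gains)) for px in pixels]


# Hypothetical example: a gray patch observed under a warm illuminant.
illuminant = (0.6, 0.5, 0.25)
patch = [(0.3, 0.25, 0.125)]
corrected = white_balance(patch, illuminant)
```

After correction, the three channels of the gray patch are approximately equal, which is the sense in which the color cast is removed.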

Paper Structure (14 sections, 21 equations, 9 figures, 2 tables)


Figures (9)

  • Figure 1: Conventional illuminant estimators often rely on a single frame for illuminant color estimation. Although the HDR camera pipeline includes at least two frames of the same scene, conventional methods usually consider only a single frame (either with auto exposure, shown in the dashed green line, a merged frame, or one frame of the burst capture) [HDR+, yahiaoui2019overview, delbracio2021mobile, MSCHAPTER]. This paper proposes leveraging information from dual-exposure capturing in multi-exposure HDR imaging to enhance illuminant estimation in camera pipelines. Our method uses frames captured at long and short exposures, available in multi-exposure bursts [HDR-on-mobile] or staggered HDR sensors [shdr1, shdr2]. Achieving comparable or superior results, our method employs lightweight models ($\sim$300--6000 parameters) compared to those using hundreds of thousands or millions of parameters. In this figure and the following figures, all raw images have the gamma operator applied to aid visualization, and all sRGB images are rendered using the HDR+ camera pipeline [HDR+].
  • Figure 2: Cameras perceive different amounts of photons when capturing the same scene with different exposure times. Images taken with long and short exposure times exhibit variations in each channel due to the camera response function and scene irradiance. Additionally, spatial variations in each color channel can be observed based on object reflectance, as the interplay of lighting, object reflectance, and the camera response function leads to different outcomes. (A) and (B) show raw images of scenes captured under indoor and outdoor lighting, respectively. In (C) and (D), we present the average $rg$-chromaticity histogram and aggregated red, green, and blue pixel values from 25 images sharing similar lighting conditions in (A) and (B), respectively.
  • Figure 3: We present an illuminant-related dual-exposure feature (DEF), derived from a pair of images captured with short and long exposures. Using DEF, we deploy a simple multilayer perceptron network (MLP) with only 354 parameters, referred to as exposure-based MLP or EMLP, for illuminant estimation, as shown in (A). We further explore the integration of DEF into the CCC framework, as shown in (B), by dynamically generating a bias map based on DEF. We denote this modified CCC framework as exposure-based CCC or ECCC.
  • Figure 4: Examples from the dataset used in this work. For each scene, we captured the scene with a gray calibration object placed in it to obtain the ground-truth illuminant (A) and captured the scene using different exposure settings without the gray object (B-H). The terms 'short $/ e$' (C-E) and 'long $\times e$' (F-H) refer to dividing and multiplying the auto-exposure time by a factor $e$, respectively. The first image in (A) is displayed in sRGB, while the rest are shown in raw RGB space.
  • Figure 5: Randomly selected examples from the worst 25% of our ECCC results (the top two rows are from the validation set and the remaining rows are from the testing set). (A) Input pair of raw images captured with long and short exposures (note that other methods use a single image captured with auto exposure). (B-G) Images corrected with the illuminant estimated by: (B) FFCC [FFCC], (C) C5 [C5] (choosing the best results among all variations discussed in Sec. \ref{sec:resutls}), (D) TLCC [TLCC], (E) EMLP, (F) ECCC, and (G) the ground-truth illuminant. The estimated illuminant of each method is shown on the right side of each image, along with the angular error written in the top-left corner of the image.
  • ...and 4 more figures
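The captions above refer to two standard quantities: the $rg$-chromaticity of a pixel (Figure 2) and the angular error between an estimated and a ground-truth illuminant (Figure 5). The sketch below shows their usual textbook definitions, assuming linear RGB triplets. The function names are illustrative, and the paper's exact histogram construction is not reproduced here.

```python
import math


def rg_chromaticity(rgb):
    """Return (r, g) with r = R / (R + G + B) and g = G / (R + G + B)."""
    total = sum(rgb)
    if total <= 0:
        raise ValueError("pixel must have a positive channel sum")
    return (rgb[0] / total, rgb[1] / total)


def angular_error_deg(estimate, ground_truth):
    """Angle in degrees between two illuminant vectors.

    The measure ignores vector magnitude, so only the illuminant
    color (direction in RGB space) matters.
    """
    dot = sum(a * b for a, b in zip(estimate, ground_truth))
    norm = math.sqrt(sum(a * a for a in estimate)) * math.sqrt(
        sum(b * b for b in ground_truth)
    )
    if norm == 0:
        raise ValueError("illuminant vectors must be non-zero")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))
```

Because the angular error depends only on direction, scaling an illuminant estimate by any positive constant leaves the error unchanged.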