MixLight: Borrowing the Best of both Spherical Harmonics and Gaussian Models

Xinlong Ji, Fangneng Zhan, Shijian Lu, Shi-Sheng Huang, Hua Huang

TL;DR

MixLight estimates HDR scene illumination from a single limited field-of-view (FOV) image by combining two parametric models: Spherical Harmonics (SH) for low-frequency ambient lighting and Spherical Gaussians (SG) for high-frequency light sources. A new SLSparsemax module enforces sparsity among the predicted light sources. The combination addresses the limitations of purely SH or purely SG representations and mitigates the over-smoothing or over-sparsification common in prior methods. In experiments on the Laval Indoor HDR dataset and a diverse Web Dataset, MixLight reports better accuracy (RMSE and si-RMSE) and better generalization than prior methods, which supports the value of a sparsity-aware parametric illumination model that covers both low and high frequencies for realistic rendering in mixed reality. The approach offers a practical, efficient alternative to high-dimensional illumination maps, with potential for indoor applications and for future extension to outdoor and spatially-varying illumination.

Abstract

Accurately estimating scene lighting is critical for applications such as mixed reality. Existing works estimate illumination either by generating illumination maps or by regressing illumination parameters. However, generating illumination maps generalizes poorly, and parametric models such as Spherical Harmonics (SH) and Spherical Gaussians (SG) fall short in capturing high-frequency and low-frequency components, respectively. This paper presents MixLight, a joint model that exploits the complementary characteristics of SH and SG to achieve a more complete illumination representation: SH captures the low-frequency ambient light and SG captures the high-frequency light sources. In addition, a spherical light source sparsemax (SLSparsemax) module, which takes into account the positional and brightness relationships between spherical light sources, is designed to improve their sparsity, a significant property that prior works omit. Extensive experiments demonstrate that MixLight surpasses state-of-the-art (SOTA) methods on multiple metrics. Experiments on the Web Dataset also show that MixLight, as a parametric method, generalizes better than non-parametric methods.
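
To make the SH-plus-SG split concrete, here is a minimal NumPy sketch of evaluating a mixed illumination model at a set of directions. It assumes an order-2 real SH basis (9 coefficients per color channel) for the ambient term and the common SG form `amplitude * exp(sharpness * (dot(v, axis) - 1))` for the light sources. The SH order, the SG parameterization, and all function and variable names are illustrative assumptions; the abstract does not specify them.

```python
import numpy as np

def sh_basis_order2(dirs):
    """Real SH basis up to order 2 (9 functions) at unit directions of shape (N, 3)."""
    x, y, z = dirs[:, 0], dirs[:, 1], dirs[:, 2]
    return np.stack([
        0.282095 * np.ones_like(x),
        0.488603 * y,
        0.488603 * z,
        0.488603 * x,
        1.092548 * x * y,
        1.092548 * y * z,
        0.315392 * (3.0 * z**2 - 1.0),
        1.092548 * x * z,
        0.546274 * (x**2 - y**2),
    ], axis=1)  # (N, 9)

def sg_lobes(dirs, axes, sharpness, amplitudes):
    """Sum of K spherical Gaussians evaluated at N directions, returns (N, 3)."""
    cos = dirs @ axes.T                                  # (N, K) cosines to each lobe axis
    weights = np.exp(sharpness[None, :] * (cos - 1.0))   # (N, K), equals 1 on the axis
    return weights @ amplitudes                          # (N, 3) RGB radiance

def mixed_radiance(dirs, sh_coeffs, axes, sharpness, amplitudes):
    """Low-frequency ambient term (SH) plus high-frequency light sources (SG)."""
    ambient = sh_basis_order2(dirs) @ sh_coeffs          # (N, 3)
    sources = sg_lobes(dirs, axes, sharpness, amplitudes)
    return ambient + sources

# Hypothetical parameters: constant gray ambient light and one sharp light along +z.
dirs = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]])
sh_coeffs = np.zeros((9, 3))
sh_coeffs[0] = 1.0
axes = np.array([[0.0, 0.0, 1.0]])
sharpness = np.array([50.0])
amplitudes = np.array([[10.0, 10.0, 10.0]])

radiance = mixed_radiance(dirs, sh_coeffs, axes, sharpness, amplitudes)
```

Looking straight at the light, the SG lobe dominates; looking 90 degrees away, only the smooth SH ambient term remains. This is the complementary behavior the abstract describes.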

Paper Structure (19 sections, 11 equations, 11 figures, 3 tables, 1 algorithm)

This paper contains 19 sections, 11 equations, 11 figures, 3 tables, 1 algorithm.

Figures (11)

  • Figure 1: Two violin plots illustrate the distinct advantages of SH and SG in illumination estimation. The objective is to examine how SH and SG functions differ in representing various light components. In the first scenario, SH and SG with the same number of parameters are used to represent ambient light, and two networks are trained to predict the SH and SG parameters respectively. The predicted ambient light is then used to render spheres (depicted in Figure 4) on the test set. Prediction accuracy is evaluated by calculating the error between the predicted rendering and the ground-truth rendering. The errors from all test set samples are visualized in the first violin plot, which shows their distribution and average. In the second scenario, SH and SG with a similar number of parameters are used to represent the light sources; the networks are trained in the same way, and the rendering error is shown in the second violin plot. For further details about the experiment's design, refer to the supplementary file.
  • Figure 2: The proposed MixLight estimates illumination and re-illuminates multiple virtual objects (in the second row). MixLight estimates low-dimensional lighting parameters that can be visualized as illumination maps (at the top right of each example) from limited field-of-view pictures (at the top left of each example).
  • Figure 3: MixLight parameter decomposition and estimation. In the right half of the figure, an illumination map is separated into the light sources and the ambient light component, then decomposed into ground-truth values of the MixLight parameters (including SG and SH parameters). The left half describes the parameter regression process. Specifically, the MixLight model uses an SLSparsemax activation function layer in the network to enforce the sparsity of light source parameters. Consistent with prior work [zhan2021emlight, zhan2022gmlight, garon2019fast, gardner2019deep], MixLight utilizes DenseNet121 as the backbone network. The input of the network is cropped from the illumination map (corresponding to the area in the red box).
  • Figure 4: The scenes used in evaluations consist of three spheres with different materials including diffuse gray, matte silver and mirror silver.
  • Figure 5: Visual comparison of the predicted results. The first column shows the limited-FOV images that serve as the input for all illumination estimation methods, followed by the predicted results visualized as illumination maps. The first two rows are test samples from the Laval Indoor HDR Dataset, while the last two rows are from the Web Dataset.
  • ...and 6 more figures
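
The Figure 3 caption mentions an SLSparsemax activation layer that enforces sparsity among light source parameters. How SLSparsemax uses the positions and brightness of the light sources is not given in this summary, so the sketch below shows only the standard sparsemax projection as a stand-in. It illustrates how such an activation can drive weak candidates to exactly zero; the candidate scores are hypothetical.

```python
import numpy as np

def sparsemax(z):
    """Standard sparsemax: Euclidean projection of scores onto the probability simplex.

    This is NOT the paper's SLSparsemax, only the generic sparsemax it is named after.
    """
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]                # scores in descending order
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, z.size + 1)
    support = 1.0 + k * z_sorted > cumsum      # which sorted entries stay non-zero
    k_max = k[support][-1]                     # size of the support set
    tau = (cumsum[k_max - 1] - 1.0) / k_max    # threshold subtracted from every score
    return np.maximum(z - tau, 0.0)

# Hypothetical brightness scores for four candidate light sources.
scores = np.array([2.0, 1.5, 0.1, -1.0])
weights = sparsemax(scores)
```

Unlike softmax, which assigns every candidate a strictly positive weight, sparsemax returns exact zeros for the two weakest candidates here while the remaining weights still sum to one.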