DINOLight: Robust Ambient Light Normalization with Self-supervised Visual Prior Integration

Youngjin Oh; Junhyeong Kwon; Nam Ik Cho

Abstract

This paper presents DINOLight, a new ambient light normalization framework that integrates the image understanding capability of the self-supervised model DINOv2 into the restoration process as a visual prior. Ambient light normalization aims to restore images degraded by non-uniform shadows and lighting caused by multiple light sources and complex scene geometries. We observe that DINOv2 can reliably extract both semantic and geometric information from a degraded image. Based on this observation, we develop a novel framework that uses DINOv2 features for lighting normalization. First, we propose an adaptive feature fusion module that combines features from different DINOv2 layers using a point-wise softmax mask. Next, the fused features are integrated into our proposed restoration network in both the spatial and frequency domains through an auxiliary cross-attention mechanism. Experiments show that DINOLight achieves superior performance on the Ambient6K dataset and that DINOv2 features are effective for enhancing ambient light normalization. We also apply our method to shadow-removal benchmark datasets, achieving competitive results compared to methods that use mask priors. Code will be released upon acceptance.
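To make the fusion step concrete, the following is a minimal NumPy sketch of combining several feature maps with a point-wise softmax mask. It is an illustration under stated assumptions, not the paper's implementation: the function name `pointwise_softmax_fusion`, the tensor shapes, and the random `logits` (which a trained model would presumably predict from the features) are all hypothetical.

```python
import numpy as np


def pointwise_softmax_fusion(features, logits):
    """Fuse K feature maps with a per-location softmax mask.

    features: array of shape (K, C, H, W), one map per selected layer.
    logits:   array of shape (K, H, W), unnormalized per-layer scores.
              Hypothetical input: a learned module would normally produce it.
    Returns the fused map (C, H, W) and the mask (K, H, W).
    """
    # Softmax over the layer axis, stabilized by subtracting the maximum.
    shifted = logits - logits.max(axis=0, keepdims=True)
    weights = np.exp(shifted)
    mask = weights / weights.sum(axis=0, keepdims=True)
    # Weighted sum over layers at every spatial location.
    fused = (mask[:, None, :, :] * features).sum(axis=0)
    return fused, mask


# Toy example: 3 layers, 8 channels, 4x4 spatial grid (assumed sizes).
rng = np.random.default_rng(0)
features = rng.standard_normal((3, 8, 4, 4))
logits = rng.standard_normal((3, 4, 4))
fused, mask = pointwise_softmax_fusion(features, logits)
```

Because the mask sums to one across layers at each location, the fused feature is a convex combination of the per-layer features, so each pixel can draw more heavily on whichever layer is most useful there.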

Paper Structure (13 sections, 3 equations, 3 figures, 4 tables)

Introduction
Related Work
Shadow Removal
Ambient Light Normalization
DINOv2 on Low-Level Vision Tasks
Method
DINOv2 Feature Extraction and Fusion
DINOv2 Feature-Integrated ALN
Experiments
Results
Ablation Studies
Application to Shadow Removal
Conclusion

Figures (3)

Figure 1: Comparison of PCA visualizations of DINOv2 features for pairs of images from the Ambient6K dataset that show the same content under different lighting conditions. We observe that DINOv2 features contain both degradation-dependent and degradation-independent information that varies with layer depth, motivating us to use them for restoration. More examples can be found in the Supplementary Material.
Figure 2: Overview of DINOLight, and the two core elements: Adaptive Feature Fusion Module (AFFM), and Auxiliary Cross-Attention (ACA) of the proposed SFDINO block. (a) The first stage involves extracting features from various layers of DINOv2 and combining them using AFFM. (b) In the second stage, the fused features are integrated into the ALN process at multiple resolutions using ACA in SFDINO blocks.
Figure 3: Qualitative comparison on the Ambient6K test set. From left to right: input; results of the versatile restoration network NAFNet [chen2022simple]; the previous ALN methods IFBlend [vasluianu2024towards] and PromptNorm [serrano2025promptnorm]; ours; and ground truth. Red boxes highlight regions where DINOLight normalizes lighting more effectively than the compared methods.
