Clair Obscur: an Illumination-Aware Method for Real-World Image Vectorization

Xingyue Lin, Shuai Peng, Xiangyu Xie, Jianhua Zhu, Yuxuan Zhou, Liangcai Gao

TL;DR

The paper addresses real-world image vectorization by reducing fragmentation and enhancing editability through a Clair-Obscur-inspired, intrinsic decomposition in the vector domain. It introduces COVec, which represents an image as a unified vector composition $\mathcal{V}_{final} = \mathcal{V}_{A} * \mathcal{V}_{I}$, where $\mathcal{V}_{A}$ captures intrinsic albedo and $\mathcal{V}_{I}$ encodes shading and illumination, subsequently decomposed into $\mathcal{V}_{S}$ (shade) and $\mathcal{V}_{L}$ (light). The method initializes layers via region-wise semantic binarization and SAM-based albedo masks, then uses a two-stage differentiable rendering optimization with a structure loss and a reconstruction loss, followed by illumination refinement and layer separation. Key contributions include the first integration of intrinsic image decomposition into vector graphics, a region-wise initialization strategy for illumination modeling, and demonstrated improvements in visual fidelity and editability over state-of-the-art baselines. This work enables high-fidelity, editable vector representations for real-world imagery, with practical impact on scalable, resolution-independent editing and rendering of complex scenes.
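
As a rough illustration of the composition $\mathcal{V}_{A} * \mathcal{V}_{I}$ described above, the sketch below works on rasterized layers rather than vector shapes. It assumes the blend is a per-pixel multiplication of an RGB albedo by a scalar illumination factor, clipped to $[0, 1]$; the function name `compose` and the toy values are hypothetical and not taken from the paper.

```python
# Minimal sketch (assumption): the blend is a per-pixel product of albedo
# and illumination, clipped to [0, 1]. Layers are plain nested lists here.

def compose(albedo, illumination):
    """Multiply each RGB albedo pixel by its scalar illumination factor."""
    return [
        [
            tuple(min(1.0, max(0.0, channel * factor)) for channel in rgb)
            for rgb, factor in zip(albedo_row, illum_row)
        ]
        for albedo_row, illum_row in zip(albedo, illumination)
    ]

# Toy 2x2 scene: a flat orange albedo under uneven lighting.
orange = (0.8, 0.4, 0.2)
albedo = [[orange, orange], [orange, orange]]
illumination = [[0.5, 1.0], [1.0, 1.2]]  # < 1 darkens, > 1 brightens
final = compose(albedo, illumination)

# Editing only the albedo keeps the light-shade structure intact.
blue = (0.2, 0.4, 0.8)
edited_albedo = [[blue, blue], [blue, blue]]
edited = compose(edited_albedo, illumination)
```

Because the illumination layer is untouched, the recolored result keeps the same dark and bright pixels as the original, which is the kind of layer-level edit the method aims to support.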

Abstract

Image vectorization aims to convert raster images into editable, scalable vector representations while preserving visual fidelity. Existing vectorization methods struggle to represent complex real-world images, often producing fragmented shapes at the cost of semantic conciseness. In this paper, we propose COVec, an illumination-aware vectorization framework inspired by the Clair-Obscur principle of light-shade contrast. COVec is the first to introduce intrinsic image decomposition in the vector domain, separating an image into albedo, shade, and light layers in a unified vector representation. A semantic-guided initialization and two-stage optimization refine these layers with differentiable rendering. Experiments on various datasets demonstrate that COVec achieves higher visual fidelity and significantly improved editability compared to existing methods.

Paper Structure

This paper contains 22 sections, 27 equations, 19 figures.

Figures (19)

  • Figure 1: Layer-wise rendering and editing results of our method. The top rows show the progressive composition from albedo, shade, and light layers, revealing how illumination enhances depth and realism. The bottom row demonstrates layer-level editability: modifying only the albedo layer while preserving illumination, enabling flexible edits without disrupting the original light–shade structure.
  • Figure 2: The albedo, shade, and light layers of an image.
  • Figure 3: The principle of Clair-Obscur in art. Classical painting and modern animation use tone variations within the same semantic regions (e.g. skin, hair) to convey light–shade structure.
  • Figure 4: Pipeline of COVec. We initialize albedo and illumination layers with distinct strategies to obtain their respective masks. The first stage jointly optimizes both layers via differentiable rendering, where $\mathcal{L}_{\text{struct}}$ enforces structural consistency with the segmented masks and $\mathcal{L}_{\text{recon}}$ minimizes the reconstruction error between the blended result (given by the rendering equation) and the target image. The second stage refines illumination details with $\mathcal{L}_{\text{refine}}$. Finally, the illumination layer is separated into shade and light components to produce a unified SVG output.
  • Figure 5: Illustration of the region-wise semantic binarization process. The target image is binarized within each semantic mask, yielding illumination masks aligned with object boundaries.
  • ...and 14 more figures
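
The region-wise semantic binarization in Figure 5 can be sketched on a toy example. The code below assumes a grayscale image and an integer label map as nested lists, and thresholds each pixel against the mean brightness of its own semantic region. The mean threshold and the function name `regionwise_binarize` are assumptions for illustration; the paper may use a different thresholding rule.

```python
# Minimal sketch (assumption): binarize each pixel against the mean
# brightness of its own semantic region instead of a global threshold.

def regionwise_binarize(gray, labels):
    """Return a 0/1 mask: 1 where a pixel is brighter than its region mean."""
    sums, counts = {}, {}
    for gray_row, label_row in zip(gray, labels):
        for value, label in zip(gray_row, label_row):
            sums[label] = sums.get(label, 0.0) + value
            counts[label] = counts.get(label, 0) + 1
    means = {label: sums[label] / counts[label] for label in sums}
    return [
        [1 if value > means[label] else 0
         for value, label in zip(gray_row, label_row)]
        for gray_row, label_row in zip(gray, labels)
    ]

# Two regions: a bright object (label 0) next to a dark one (label 1).
gray = [[0.9, 0.7, 0.3, 0.1],
        [0.8, 0.6, 0.2, 0.05]]
labels = [[0, 0, 1, 1],
          [0, 0, 1, 1]]
mask = regionwise_binarize(gray, labels)

# For contrast: one global mean threshold over the whole image.
pixels = [v for row in gray for v in row]
global_mean = sum(pixels) / len(pixels)
global_mask = [[1 if v > global_mean else 0 for v in row] for row in gray]
```

In this toy case the global threshold only separates the bright object from the dark one, while the region-wise mask finds a lit and a shaded side inside each object, so the resulting illumination masks stay aligned with the region boundaries.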