Pygmalion Effect in Vision: Image-to-Clay Translation for Reflective Geometry Reconstruction
Gayoung Lee, Junho Kim, Jin-Hwa Kim, Junmo Kim
TL;DR
This work addresses the difficulty of recovering 3D geometry from scenes with strong view-dependent reflections. It introduces the Pygmalion Effect in Vision, a dual-branch framework that pairs a BRDF-based reflective branch with a clay-guided branch; the latter uses image-to-clay translation to produce reflection-free supervision for geometry learning. A key contribution is clay-guided reflective Gaussian splatting: a per-Gaussian clay color and a clay-rendering loss stabilize normal estimation and geometry, while the BRDF branch handles appearance. Training follows a staged schedule that emphasizes clay supervision early and RGB supervision later. Experiments on synthetic and real datasets show consistent gains in mesh quality and reconstruction stability, indicating that translating radiance into neutral clay renders provides a strong inductive bias for reflective-object geometry learning, with practical benefits for downstream tasks such as relighting and editing.
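The staged schedule mentioned above can be illustrated with a small sketch. This is not the authors' implementation: the linear ramp, the endpoint weights (0.9 and 0.1), the L1 loss, and the names `clay_rgb_weights` and `total_loss` are all assumptions chosen for illustration. The summary states only that clay supervision is emphasized early and RGB supervision later.

```python
def clay_rgb_weights(step, total_steps, clay_start=0.9, clay_end=0.1):
    """Return (w_clay, w_rgb) for a training step.

    Assumption: a linear ramp from clay_start to clay_end. The source only
    says clay supervision dominates early and RGB supervision later.
    """
    if total_steps < 2:
        raise ValueError("total_steps must be at least 2")
    # Normalized training progress in [0, 1].
    t = min(max(step / (total_steps - 1), 0.0), 1.0)
    w_clay = clay_start + (clay_end - clay_start) * t
    w_rgb = 1.0 - w_clay
    return w_clay, w_rgb


def l1(pred, target):
    """Mean absolute error over two equal-length flat lists of pixel values."""
    return sum(abs(p - q) for p, q in zip(pred, target)) / len(pred)


def total_loss(rgb_pred, rgb_gt, clay_pred, clay_gt, step, total_steps):
    """Hypothetical combined objective: weighted clay-rendering loss plus RGB loss."""
    w_clay, w_rgb = clay_rgb_weights(step, total_steps)
    return w_clay * l1(clay_pred, clay_gt) + w_rgb * l1(rgb_pred, rgb_gt)
```

Under these assumed settings, the clay term carries most of the weight at the first step and the RGB term most of the weight at the last. Here `clay_gt` stands for pixels of the translated clay image for the same view and `clay_pred` for the render produced from the per-Gaussian clay colors.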
Abstract
Understanding reflection remains a long-standing challenge in 3D reconstruction due to the entanglement of appearance and geometry under view-dependent reflections. In this work, we present the Pygmalion Effect in Vision, a novel framework that metaphorically "sculpts" reflective objects into clay-like forms through image-to-clay translation. Inspired by the myth of Pygmalion, our method learns to suppress specular cues while preserving intrinsic geometric consistency, enabling robust reconstruction from multi-view images containing complex reflections. Specifically, we introduce a dual-branch network in which a BRDF-based reflective branch is complemented by a clay-guided branch that stabilizes geometry and refines surface normals. The two branches are trained jointly using the synthesized clay-like images, which provide a neutral, reflection-free supervision signal that complements the reflective views. Experiments on both synthetic and real datasets demonstrate substantial improvement in normal accuracy and mesh completeness over existing reflection-handling methods. Beyond technical gains, our framework reveals that seeing by unshining, translating radiance into neutrality, can serve as a powerful inductive bias for reflective object geometry learning.
