Harmonizing Attention: Training-free Texture-aware Geometry Transfer
Eito Ikuta, Yohan Lee, Akihiro Iohara, Yu Saito, Toshiyuki Tanaka
TL;DR
Harmonizing Attention tackles geometry transfer across materials without model training by modifying diffusion-model self-attention so that it can query multiple reference images. Texture-aligning Attention during inversion and Geometry-aligning Attention during generation decouple geometry from material texture while maintaining texture continuity, all without fine-tuning. The approach demonstrates superior geometry fidelity and perceptual harmony in both qualitative and quantitative evaluations against strong baselines, including lightweight user studies. This training-free framework broadens practical image compositing applications such as augmented reality and advanced image editing by providing robust, geometry-first harmonization with minimal dataset requirements.
Abstract
Extracting geometry features from photographic images independently of surface texture and transferring them onto different materials remains a complex challenge. In this study, we introduce Harmonizing Attention, a novel training-free approach that leverages diffusion models for texture-aware geometry transfer. Our method employs a simple yet effective modification of self-attention layers, allowing the model to query information from multiple reference images within these layers. This mechanism is seamlessly integrated into the inversion process as Texture-aligning Attention and into the generation process as Geometry-aligning Attention. This dual-attention approach ensures the effective capture and transfer of material-independent geometry features while maintaining material-specific textural continuity, all without the need for model fine-tuning.
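The core idea of letting a self-attention layer query several reference images can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes the references are incorporated by concatenating their keys and values with the layer's own, which is one common way to realize such a mechanism. The function name `multi_reference_attention` and all argument names are hypothetical, and the sketch does not model which images serve as references during inversion versus generation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_reference_attention(q, k_self, v_self, ref_keys, ref_values):
    """Single-head attention whose queries also attend to reference images.

    q, k_self, v_self: arrays of shape (n, d) from the image being processed.
    ref_keys, ref_values: lists of arrays of shape (m_i, d), one per reference.
    Assumption (not from the paper): references are merged by concatenating
    their keys/values with the image's own before the softmax.
    """
    k = np.concatenate([k_self, *ref_keys], axis=0)    # (n + sum(m_i), d)
    v = np.concatenate([v_self, *ref_values], axis=0)  # (n + sum(m_i), d)
    scores = q @ k.T / np.sqrt(q.shape[-1])            # scaled dot product
    weights = softmax(scores, axis=-1)                 # each row sums to 1
    return weights @ v, weights

# Toy usage: 4 query tokens, two references with 3 and 5 tokens, feature dim 8.
rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(4, 8)) for _ in range(3))
rk1, rv1 = rng.normal(size=(3, 8)), rng.normal(size=(3, 8))
rk2, rv2 = rng.normal(size=(5, 8)), rng.normal(size=(5, 8))
out, w = multi_reference_attention(q, k, v, [rk1, rk2], [rv1, rv2])
```

With empty reference lists the function reduces to ordinary self-attention, which reflects why a modification of this kind needs no retraining: the pretrained projection weights are reused and only the set of attended tokens changes.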
