Semantic Anchoring for Robust Personalization in Text-to-Image Diffusion Models

Seoyun Yang; Gihoon Kim; Taesup Kim

TL;DR

Personalization in text-to-image diffusion is hindered by semantic drift when learning rare subjects from few references. The authors introduce Semantic Anchoring Personalization, a training-time objective that anchors rare-subject learning to the pretrained semantics of the corresponding frequent concept, blending guidance from both concepts. The method yields consistent improvements in subject fidelity and text-image alignment across multiple backbones, with ablation studies supporting the result. The anchoring strategy enables stable expansion of the pretrained distribution toward personalized regions while preserving its semantic structure, offering robust, generalizable personalization for diffusion models.

Abstract

Text-to-image diffusion models have achieved remarkable progress in generating diverse and realistic images from textual descriptions. However, they still struggle with personalization, which requires adapting a pretrained model to depict user-specific subjects from only a few reference images. The key challenge lies in learning a new visual concept from these few images while preserving the pretrained semantic prior that maintains text-image alignment. When the model focuses on subject fidelity, it tends to overfit the limited reference images and fails to leverage the pretrained distribution. Conversely, emphasizing prior preservation maintains semantic consistency but prevents the model from learning new personalized attributes. Building on these observations, we propose guiding the personalization process through semantic anchoring, which grounds new concepts in their corresponding distributions. We therefore reformulate personalization as the process of learning a rare concept guided by its frequent counterpart through semantic anchoring. This anchoring encourages the model to adapt new concepts in a stable and controlled manner, expanding the pretrained distribution toward personalized regions while preserving its semantic structure. As a result, the proposed method achieves stable adaptation and consistent improvements in both subject fidelity and text-image alignment compared to baseline methods. Extensive experiments and ablation studies further demonstrate the robustness and effectiveness of the proposed anchoring strategy.

