SP-Guard: Selective Prompt-adaptive Guidance for Safe Text-to-Image Generation

Sumin Yu; Taesup Moon

TL;DR

SP-Guard reduces unsafe content generation in text-to-image diffusion models through guidance that is both prompt-adaptive and region-selective. It estimates a prompt's harmfulness from the cosine similarity between the prompt-induced noise direction and the unsafe-concept directions, then applies a per-timestep mask so that guidance suppresses only the unsafe regions and leaves benign content intact. Across four unsafe-content datasets, SP-Guard delivers strong safety gains with minimal loss of image fidelity, and it outperforms prior inference-time methods in content preservation and controllability. The approach improves the trustworthiness of generative AI and offers a practical path toward adaptable safety in multimodal systems.
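The two ideas in the summary above, prompt-adaptive strength and region-selective masking, can be illustrated with a minimal NumPy sketch of one denoising step. This is not the authors' implementation: the function name `sp_guard_step`, the parameters `cfg_scale`, `safety_scale`, and `mask_quantile`, the quantile-based mask, and the way the terms are combined are all assumptions made for illustration. The only parts taken from the text are the cosine similarity between the prompt-induced and unsafe-concept noise directions and the use of a mask computed at every timestep.

```python
import numpy as np


def cosine_similarity(a, b, eps=1e-8):
    """Cosine similarity between two arrays, each flattened to a vector."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))


def sp_guard_step(eps_uncond, eps_prompt, eps_unsafe,
                  cfg_scale=7.5, safety_scale=5.0, mask_quantile=0.8):
    """One hypothetical guided denoising step (illustrative assumptions only).

    eps_uncond, eps_prompt, eps_unsafe: noise predictions of the same shape
    for the empty prompt, the user prompt, and an unsafe-concept prompt.
    """
    prompt_dir = eps_prompt - eps_uncond   # prompt-induced noise direction
    unsafe_dir = eps_unsafe - eps_uncond   # unsafe-concept direction

    # Adaptivity (assumed form): harmfulness score in [0, 1];
    # a negative similarity is treated as benign.
    harm = max(0.0, cosine_similarity(prompt_dir, unsafe_dir))

    # Selectivity (assumed form): keep only the elements where the unsafe
    # direction is strongest, recomputed at every timestep.
    magnitude = np.abs(unsafe_dir)
    threshold = np.quantile(magnitude, mask_quantile)
    mask = (magnitude >= threshold).astype(eps_uncond.dtype)

    # Classifier-free guidance plus a masked, harm-scaled push away
    # from the unsafe direction.
    guided = (eps_uncond + cfg_scale * prompt_dir
              - safety_scale * harm * mask * unsafe_dir)
    return guided, harm, mask
```

Under these assumptions, a benign prompt gets a harmfulness score near zero and the step reduces to ordinary classifier-free guidance. A harmful prompt is steered away from the unsafe concept only inside the mask, so the rest of the image is left as the prompt specified.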

Abstract

While diffusion-based text-to-image (T2I) models have achieved remarkable image generation quality, they also make harmful content easy to create, raising social concerns and highlighting the need for safer generation. Existing inference-time guidance methods lack both adaptivity (adjusting guidance strength based on the prompt) and selectivity (targeting only the unsafe regions of the image). Our method, SP-Guard, addresses these limitations by estimating prompt harmfulness and applying a selective guidance mask so that only unsafe areas are guided. Experiments show that SP-Guard generates safer images than existing methods while minimizing unintended content alteration. Beyond improving safety, our findings highlight the importance of transparency and controllability in image generation.
