Not All Latent Spaces Are Flat: Hyperbolic Concept Control

Maria Rosaria Briglia, Simone Facchiano, Paolo Cursi, Alessio Sampieri, Emanuele Rodolà, Guido Maria D'Amely di Melendugno, Luca Franco, Fabio Galasso, Iacopo Masi

Abstract

As modern text-to-image (T2I) models draw closer to synthesizing highly realistic content, the threat of unsafe content generation grows, and it becomes paramount to exercise control. Existing approaches steer these models by applying Euclidean adjustments to text embeddings, redirecting the generation away from unsafe concepts. In this work, we introduce hyperbolic control (HyCon): a novel control mechanism based on parallel transport that leverages a semantically aligned hyperbolic representation space to yield more expressive and stable manipulation of concepts. HyCon reuses off-the-shelf generative models and a state-of-the-art hyperbolic text encoder, linked via a lightweight adapter. HyCon achieves state-of-the-art results across four safety benchmarks and four T2I backbones, showing that hyperbolic steering is a practical and flexible approach for more reliable T2I generation.

Paper Structure (24 sections, 18 equations, 13 figures, 4 tables)

Figures (13)

  • Figure 1: (top) In hyperbolic space, concepts (e.g., man or coffee) form entailment cones, and concepts' composition corresponds to the cones' intersection. To edit a prompt embedding (e.g., adding coffee to man), we steer it toward the corresponding intersection. (bottom) HyCon leverages this hyperbolic geometric structure to add or remove concepts via geometry-consistent edits.
  • Figure 2: (a) On the COCO training set, we demonstrate that the HyCoCLIP structure effectively maps concept embeddings and their composites into the correct entailment cones (see the discussion in Section \ref{sec:motivation}). (b) Euclidean (top) vs. HyCon (bottom) behavior as control strength increases with Stable Diffusion 3.5: Euclidean steering leads to non-smooth or incomplete transitions. By contrast, HyCon follows a smooth geodesic trajectory and remains stable for larger $\lambda$, consistently increasing the influence of the steered concept.
  • Figure 3: Semantic alignment distributions of samples retrieved from concept-specific entailment cones. For each concept, the semantic alignment is measured using the CLIPScore of the retrieved embeddings after they are mapped back to their Euclidean representations. Embeddings align more closely with the corresponding concept prompt (blue) than with other concepts (red).
  • Figure 4: Qualitative results on Ring-a-Bell (top) and COCO retain set (bottom). For each dataset, columns show Baseline, SAFREE, and HyCon (left to right). On Ring-a-Bell, both methods suppress the target unsafe concept, while on COCO HyCon better preserves non-target content and overall visual fidelity.
  • Figure 5: Qualitative examples. Top row: removing the “Van Gogh” concept with HyCon. As the steering strength increases, the generation remains stable and preserves the intended content. Bottom row: adding the concept “night”. Zoom in for details.
  • ...and 8 more figures
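
The abstract and the Figure 2 caption describe steering a prompt embedding along a geodesic using parallel transport, with a strength $\lambda$. The sketch below illustrates that general idea on the Lorentz (hyperboloid) model with curvature $-1$, using the standard textbook formulas for the logarithmic map, exponential map, and parallel transport. It is not the paper's implementation: the function names, the choice of the origin as the point where the concept direction is defined, and the toy 2-D inputs are all assumptions made for illustration, and the hyperbolic text encoder and adapter are not modeled.

```python
import numpy as np

def lorentz_inner(u, v):
    """Lorentzian inner product <u, v>_L = -u0*v0 + sum_i ui*vi."""
    return -u[0] * v[0] + np.dot(u[1:], v[1:])

def lift(z):
    """Place a Euclidean vector z on the unit hyperboloid (<x, x>_L = -1)."""
    z = np.asarray(z, dtype=float)
    return np.concatenate(([np.sqrt(1.0 + z @ z)], z))

def log_map(x, y):
    """Tangent vector at x pointing toward y; its length is the geodesic distance."""
    alpha = max(-lorentz_inner(x, y), 1.0)  # clamp against rounding error
    dist = np.arccosh(alpha)
    if dist < 1e-7:
        return np.zeros_like(x)
    return dist * (y - alpha * x) / np.sqrt(alpha**2 - 1.0)

def exp_map(x, v):
    """Follow the geodesic that starts at x with initial velocity v for unit time."""
    norm = np.sqrt(max(lorentz_inner(v, v), 0.0))
    if norm < 1e-12:
        return x.copy()
    return np.cosh(norm) * x + np.sinh(norm) * v / norm

def parallel_transport(x, y, v):
    """Move tangent vector v from the tangent space at x to the one at y."""
    return v + lorentz_inner(y, v) / (1.0 - lorentz_inner(x, y)) * (x + y)

def steer(prompt, concept, lam):
    """Hypothetical steering rule (an assumption, not the paper's method):
    take the concept direction at the origin, transport it to the prompt,
    and step along the resulting geodesic with strength lam."""
    origin = lift(np.zeros(len(prompt) - 1))
    direction = log_map(origin, concept)
    moved = parallel_transport(origin, prompt, direction)
    return exp_map(prompt, lam * moved)

# Toy 2-D inputs, chosen arbitrarily for illustration.
prompt = lift([0.3, -0.2])
concept = lift([1.0, 0.5])
steered = steer(prompt, concept, lam=0.5)
```

Because parallel transport preserves the Lorentzian norm of the direction and the exponential map keeps the result on the hyperboloid, increasing `lam` moves the embedding a proportionally larger geodesic distance without leaving the manifold. A Euclidean update of the form `prompt + lam * direction` offers no such guarantee.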