ToMigo: Interpretable Design Concept Graphs for Aligning Generative AI with Creative Intent

Lena Hegemann, Xinyi Wen, Michael A. Hedderich, Tarmo Nurmi, Hariharan Subramonyam

TL;DR

ToMigo addresses the misalignment between user intent and generative AI outputs by introducing interpretable design concept graphs that encode both explicit and inferred design intent. It combines multimodal input analysis (reference images and briefs) with a four-role graph schema and reasoning-enabled edges to maintain coherence and support iterative refinement. The work introduces three interactions—theory-of-mind widgets, clarifying questions, and graph-guided design generation—that surface uncertainties, enable direct editing of AI reasoning, and realign outputs with updated concepts. Two user studies demonstrate high alignment between user intentions and the graph representations, and show that ToMigo enhances grounding and design exploration, offering practical benefits for novice designers and collaborative design workflows.

Abstract

Generative AI often produces results misaligned with user intentions, for example, resolving ambiguous prompts in unexpected ways. Despite existing approaches to clarify intent, a major challenge remains: understanding and influencing AI's interpretation of user intent through simple, direct inputs requiring no expertise or rigid procedures. We present ToMigo, representing intent as design concept graphs: nodes represent choices of purpose, content, or style, while edges link them with interpretable explanations. Applied to graphic design, ToMigo infers intent from reference images and text. We derived a schema of node types and edges from pre-study data, informing a multimodal large language model to generate graphs aligning nodes externally with user intent and internally toward a unified design goal. This structure enables users to explore AI reasoning and directly manipulate the design concept. In our user studies, ToMigo received high alignment ratings and captured most user intentions well. Users reported greater control and found the interactive features (editable graphs, reflective chats, and concept-design realignment) useful for evolving and realizing their design ideas.
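
The graph structure described in the abstract can be pictured with a small data-structure sketch. This is an illustration only, not the authors' implementation: the class names, field names, and the example node texts are hypothetical, and only the three node kinds named in the abstract (purpose, content, style) are modeled rather than the paper's full schema. The one rule it tries to capture is that every edge carries a readable reason for why its source node supports its target node.

```python
from dataclasses import dataclass, field

# Node kinds named in the abstract. The paper's full schema (four roles,
# more node types) is NOT reproduced here; this set is an assumption.
NODE_KINDS = {"purpose", "content", "style"}


@dataclass(frozen=True)
class Node:
    node_id: str
    kind: str          # one of NODE_KINDS
    description: str   # concrete design decision, in plain language


@dataclass(frozen=True)
class Edge:
    source: str   # id of the supporting node
    target: str   # id of the supported node
    reason: str   # explanation of why source supports target


@dataclass
class DesignConceptGraph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        if node.kind not in NODE_KINDS:
            raise ValueError(f"unknown node kind: {node.kind}")
        self.nodes[node.node_id] = node

    def add_edge(self, edge: Edge) -> None:
        # Both endpoints must exist, and the reason must not be empty.
        if edge.source not in self.nodes or edge.target not in self.nodes:
            raise KeyError("edge endpoints must be existing nodes")
        if not edge.reason.strip():
            raise ValueError("every edge needs a reason label")
        self.edges.append(edge)

    def supporters(self, node_id: str) -> list:
        """Return (node, reason) pairs for nodes that support node_id."""
        return [
            (self.nodes[e.source], e.reason)
            for e in self.edges
            if e.target == node_id
        ]


# Hypothetical example content, loosely themed on a magician book cover.
graph = DesignConceptGraph()
graph.add_node(Node("p1", "purpose", "Signal a story about a magician"))
graph.add_node(Node("s1", "style", "Dark palette with glowing accents"))
graph.add_edge(Edge("s1", "p1", "A dark, glowing palette evokes mystery"))

for node, reason in graph.supporters("p1"):
    print(f"{node.kind}: {node.description} (because: {reason})")
```

Keeping the reason on the edge rather than on either node means an explanation can be inspected or edited without changing the design decisions it connects.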

Paper Structure (48 sections, 9 figures, 2 tables)

Figures (9)

  • Figure 1: Predicting user intentions correctly is a communication task in which a user conveys their vision (left) incompletely through simple text and image input (middle), from which the system needs to reconstruct it. The model is expected to correctly represent features that are explicitly mentioned. To reconstruct the vision fully, however, it also needs to identify implied features (for example, repeated or salient features in the images) and reason about choices for features that are less clearly indicated in the input but required to realize the explicit and implied features.
  • Figure 2: The graph schema organizes node types (orange) into four primary roles (purple). Lines indicate relationships found in our dataset where participants mentioned aspects in relation to each other. More saturated orange indicates a higher frequency in the dataset. Connections typically flow from right to left, as low-level feature decisions support choices that concern the design as a whole.
  • Figure 3: Example of a design concept graph. This concept graph instantiates the relevant node types with concrete descriptions of design decisions for the magician book cover. Edges are labeled with reasons why the source node supports the target node. All edges have reason labels, but due to space constraints the figure does not show all of them.
  • Figure 4: The study procedure consisted of four phases: the pre-task, design definition, rating, and post-task phases. The design definition phase and the rating phase were repeated three times per participant, resulting in three rated design ideas per participant.
  • Figure 5: In the rating phase, participants first rated the generated designs for alignment, both overall and per key feature that they had defined in the design definition phase. After completing these ratings for both designs, participants proceeded to rate the design concept graphs. With the nodes presented as a list, they rated the graph's overall and per-feature alignment with their idea, as they had for the designs. Finally, they rated the alignment of each node with their idea.
  • ...and 4 more figures