PromptSplit: Revealing Prompt-Level Disagreement in Generative Models

Mehdi Lotfian; Mohammad Jalali; Farzan Farnia

TL;DR

PromptSplit frames prompt-conditioned model comparison as a joint prompt–output spectral problem, constructing tensor-product embeddings and a joint kernel covariance difference $\widehat{\Lambda}_{X,Y|T}=\widehat{C}_{T\otimes X}-\eta\widehat{C}_{T\otimes Y}$ to reveal prompt-dependent disagreements. It introduces a scalable random-projection approach with complexity $\mathcal{O}((m+n)r^2+r^3)$ and proves an $O(1/r^2)$ bound on eigenspace deviation, enabling efficient analysis on large datasets. The method is validated on synthetic controls, text-to-image pipelines (e.g., SDXL, PixArt-$\Sigma$, Kandinsky), and LLMs (NQ-Open), uncovering interpretable prompt clusters where outputs diverge in style, content, or alignment. PromptSplit also demonstrates practical utility by guiding diffusion-model distribution matching and providing prompt-level disagreement maps that complement aggregate fidelity metrics. Overall, the work offers a principled, scalable toolkit for diagnosing where and how prompt-conditioned generative models disagree, with broad applicability across vision and language modalities.
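The joint construction can be illustrated with a minimal NumPy sketch, assuming explicit finite-dimensional feature maps in place of kernels. The names `joint_features` and `covariance_difference`, and the random features themselves, are hypothetical and not taken from the paper. The sketch forms row-wise tensor-product embeddings of prompt and output features, checks that their Gram matrix equals the elementwise (Hadamard) product of the prompt and output Gram matrices, and then builds $\widehat{C}_{T\otimes X}-\eta\widehat{C}_{T\otimes Y}$ from $n$ test samples and $m$ reference samples.

```python
import numpy as np

def joint_features(phi_t, phi_out):
    """Row-wise tensor product: row i equals kron(phi_t[i], phi_out[i])."""
    n = phi_t.shape[0]
    return np.einsum("ia,ib->iab", phi_t, phi_out).reshape(n, -1)

def covariance_difference(joint_x, joint_y, eta=1.0):
    """Empirical (uncentered) covariance of joint_x minus eta times that of joint_y."""
    c_x = joint_x.T @ joint_x / joint_x.shape[0]
    c_y = joint_y.T @ joint_y / joint_y.shape[0]
    return c_x - eta * c_y

# Hypothetical toy features: 3-dim prompt features, 4-dim output features.
rng = np.random.default_rng(0)
n, m = 6, 5
phi_t_x, phi_x = rng.normal(size=(n, 3)), rng.normal(size=(n, 4))
phi_t_y, phi_y = rng.normal(size=(m, 3)), rng.normal(size=(m, 4))

joint_x = joint_features(phi_t_x, phi_x)
joint_y = joint_features(phi_t_y, phi_y)

# Inner products of tensor-product features factorize, so the joint Gram
# matrix is the Hadamard product of the prompt and output Gram matrices.
gram_joint = joint_x @ joint_x.T
gram_hadamard = (phi_t_x @ phi_t_x.T) * (phi_x @ phi_x.T)
hadamard_ok = bool(np.allclose(gram_joint, gram_hadamard))

lam = covariance_difference(joint_x, joint_y, eta=1.0)  # shape (12, 12)
```

With kernels whose feature maps are infinite-dimensional, the explicit matrix `lam` is not available, which is presumably why the method works with kernel matrices or a low-dimensional approximation instead.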

Abstract

Prompt-guided generative AI models have rapidly expanded across vision and language domains, producing realistic and diverse outputs from textual inputs. The growing variety of such models, trained with different data and architectures, calls for principled methods to identify which types of prompts lead to distinct model behaviors. In this work, we propose PromptSplit, a kernel-based framework for detecting and analyzing prompt-dependent disagreement between generative models. For each compared model pair, PromptSplit constructs a joint prompt--output representation by forming tensor-product embeddings of the prompt and image (or text) features, and then computes the corresponding kernel covariance matrix. We utilize the eigenspace of the weighted difference between these matrices to identify the main directions of behavioral difference across prompts. To ensure scalability, we employ a random-projection approximation that reduces computational complexity to $O(nr^2 + r^3)$ for projection dimension $r$. We further provide a theoretical analysis showing that this approximation yields an eigenstructure estimate whose expected deviation from the full-dimensional result is bounded by $O(1/r^2)$. Experiments across text-to-image, text-to-text, and image-captioning settings demonstrate that PromptSplit accurately detects ground-truth behavioral differences and isolates the prompts responsible, offering an interpretable tool for detecting where generative models disagree.
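The scalable pipeline described in the abstract can be sketched as follows. This is not the paper's algorithm: it swaps in random Fourier features as the low-dimensional approximation and assumes Gaussian kernels on both prompts and outputs, in which case the product kernel is a Gaussian kernel on the concatenated, bandwidth-scaled embeddings. The function names, bandwidths, and toy data are all hypothetical. Forming the two $r\times r$ covariance matrices costs $O((m+n)r^2)$ and the eigendecomposition costs $O(r^3)$.

```python
import numpy as np

def rff_joint(t, out, sigma_t, sigma_out, omega, bias):
    """Random Fourier features for the product of two Gaussian kernels."""
    z = np.hstack([t / sigma_t, out / sigma_out])
    r = omega.shape[1]
    return np.sqrt(2.0 / r) * np.cos(z @ omega + bias)

def promptsplit_rff(t_x, x, t_y, y, r=256, eta=1.0,
                    sigma_t=1.0, sigma_out=1.0, seed=0):
    """Eigendecompose the r x r covariance difference of joint random features."""
    rng = np.random.default_rng(seed)
    d = t_x.shape[1] + x.shape[1]
    omega = rng.normal(size=(d, r))               # frequencies for a unit-bandwidth Gaussian
    bias = rng.uniform(0.0, 2.0 * np.pi, size=r)
    f_x = rff_joint(t_x, x, sigma_t, sigma_out, omega, bias)  # (n, r)
    f_y = rff_joint(t_y, y, sigma_t, sigma_out, omega, bias)  # (m, r)
    lam = f_x.T @ f_x / len(f_x) - eta * (f_y.T @ f_y) / len(f_y)
    eigvals, eigvecs = np.linalg.eigh(lam)        # symmetric, so eigh applies
    order = np.argsort(eigvals)[::-1]             # largest eigenvalue first
    return eigvals[order], eigvecs[:, order], f_x

# Toy data: two prompt clusters; the models agree on the first cluster
# (indices 0-99) and disagree on the second (indices 100-199).
rng = np.random.default_rng(1)
half = 100
t = np.vstack([rng.normal(0.0, 0.1, (half, 2)), rng.normal(5.0, 0.1, (half, 2))])
x = np.vstack([rng.normal(0.0, 0.1, (half, 2)), rng.normal(3.0, 0.1, (half, 2))])
y = np.vstack([rng.normal(0.0, 0.1, (half, 2)), rng.normal(-3.0, 0.1, (half, 2))])

eigvals, eigvecs, f_x = promptsplit_rff(t, x, t, y, r=512)
scores = np.abs(f_x @ eigvecs[:, 0])              # alignment with the top mode
top_idx = np.argsort(scores)[::-1][:10]           # samples driving the disagreement
```

In this toy setting the samples with the highest scores fall in the second prompt cluster, where the two output distributions differ. Real embeddings would need bandwidths tuned to their scale.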

Paper Structure (26 sections, 2 theorems, 44 equations, 21 figures, 1 table, 2 algorithms)

Related Works
Preliminaries
Kernel Matrices and Covariance Operators
Tensor-Product Kernels and Hadamard Joint Kernel Matrices
Method
PromptSplit via Joint Kernel Covariance Difference
Random Projection for Scalable PromptSplit
PromptSplit Guidance for Text-Guided Diffusion Models
Numerical Results
Validation of PromptSplit in settings with known groundtruth
Application of PromptSplit to T2I Models
Application of PromptSplit to LLMs
PromptSplit Guidance for Distribution Matching in LDMs.
Ablation Studies
Conclusion
...and 11 more sections

Key Result

Proposition 1

The matrices $\widehat{\Lambda}_{X,Y|T}$ and $K_{X,\eta Y|T}$ share the same non-zero eigenvalues. Moreover, for every eigenvector $u=[u_{1:n};u_{(n+1):(n+m)}]$ of $K_{X,\eta Y|T}$ with non-zero eigenvalue $\lambda$, the following $v$ is an eigenvector of $\widehat{\Lambda}_{X,Y|T}$ …
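The eigenvalue equivalence can be checked numerically with a small sketch. The statement above does not give the exact form of $K_{X,\eta Y|T}$, so the code assumes a signed block Gram matrix $D A A^\top$, where $A$ stacks the scaled joint features of the two sample sets and $D=\mathrm{diag}(I_n,-I_m)$. Under that assumption the covariance difference is $A^\top D A$, and the shared spectrum follows from the standard fact that $MN$ and $NM$ have the same non-zero eigenvalues. The random features and all variable names are hypothetical stand-ins for the joint prompt-output embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, d, eta = 4, 3, 5, 0.8
phi_x = rng.normal(size=(n, d))   # stand-in joint features, test model
phi_y = rng.normal(size=(m, d))   # stand-in joint features, reference model

# d x d covariance difference in feature space.
lam = phi_x.T @ phi_x / n - eta * (phi_y.T @ phi_y) / m

# Assumed (n+m) x (n+m) kernel-side matrix: D @ A @ A.T with D = diag(I_n, -I_m).
a = np.vstack([phi_x / np.sqrt(n), np.sqrt(eta) * phi_y / np.sqrt(m)])
signs = np.concatenate([np.ones(n), -np.ones(m)])
k_mat = signs[:, None] * (a @ a.T)

lam_eigs = np.linalg.eigvalsh(lam)             # ascending, real
k_vals, k_vecs = np.linalg.eig(k_mat)          # k_mat is not symmetric
keep = np.argsort(np.abs(k_vals))[::-1][:d]    # drop the (n + m - d) zero eigenvalues
k_eigs = np.sort(k_vals[keep].real)
eigs_match = bool(np.allclose(lam_eigs, k_eigs, atol=1e-8))

# Under this assumed form, v = A^T u maps an eigenvector u of k_mat with
# non-zero eigenvalue to an eigenvector v of lam with the same eigenvalue.
top = int(np.argmax(k_vals.real))
u = k_vecs[:, top].real
v = a.T @ u
residual = float(np.linalg.norm(lam @ v - k_vals[top].real * v))
```

The practical consequence is that the spectrum of the covariance difference can be computed from an $(n+m)\times(n+m)$ kernel matrix even when the feature space is too large to work in directly.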

Figures (21)

Figure 1: Overview of PromptSplit for discovering distinct types of (prompt, answer) pairs on which two models differ. (a) From NQ-Open questions, we generate outputs from the test model (Qwen3) and the reference model (Gemma3). (b) Two high-scoring modes found by PromptSplit: for each mode we show representative prompts and the corresponding model outputs.
Figure 2: Top clusters of prompts with distinct images identified by PromptSplit. Top: 20 clusters of prompts with sample images for the test and reference datasets. Bottom: Top 16 eigenvalues showing the top 10 disagreement-causing prompts.
Figure 3: PromptSplit uncovers occupation-based divergences between PixArt-$\Sigma$ (test model) and SDXL (reference model). (Top middle) Bar plot of the largest eigenvalues, highlighting the top nine distinct clusters. (Top right) t-SNE projection of image embeddings, revealing occupational clusters. (Bottom) Representative generated images for different $\lambda$ values.
Figure 4: Qualitative comparison of reference set and PS-guided image generation with SDXL.
Figure 5: PromptSplit detected style and scene disagreements in a controlled text-to-image setting. (Top) Sample prompts, generated outputs, and bar plot of top 16 eigenvalues. (Bottom) Strongest samples for top identified distinct modes.
...and 16 more figures

Theorems & Definitions (5)

Definition 1
Proposition 1
proof
Theorem 1
proof
