Vipera: Towards systematic auditing of generative text-to-image models at scale

Yanwei Huang, Wesley Hanwen Deng, Sijia Xiao, Motahhare Eslami, Jason I. Hong, Adam Perer

TL;DR

Vipera tackles the scalability challenge of auditing generative text-to-image outputs by combining a scene-graph visual cue with LLM-powered suggestions to guide structured, multi-faceted analysis. The system supports intuitive sensemaking through prompts, image collections, and labeling, augmented by distribution visualizations and AI-driven prompts. An observational study with five auditors shows Vipera improves breadth and organization of auditing while revealing a need for customization and improved guidance for depth. Overall, Vipera offers a practical pathway toward scalable, human-in-the-loop auditing that promotes responsible and transparent GenAI use in creative applications.

Abstract

Generative text-to-image (T2I) models are known for risks such as bias, offense, and misinformation. Current AI auditing methods face challenges in scalability and thoroughness, and it is even more challenging to enable auditors to explore the auditing space in a structured and effective way. Vipera employs multiple visual cues, including a scene graph, to facilitate sensemaking of image collections and to inspire auditors to explore and hierarchically organize the auditing criteria. Additionally, it leverages LLM-powered suggestions to facilitate exploration of unexplored auditing directions. An observational user study demonstrates Vipera's effectiveness in helping auditors organize their analyses while engaging with diverse criteria.

Paper Structure

This paper contains 17 sections, 2 figures.

Figures (2)

  • Figure 1: The ViperaBase prototype used in the formative study, showing the generated images (right) and a scene graph (left) for the user prompt (top). The scene graph is a node-link diagram in which nodes represent objects (or their attributes) within the images and edges represent the semantic relationships between them. Hovering over an attribute node shows a bar chart.
  • Figure 2: Vipera's technical pipeline. Black edges indicate the data flow and blue ones indicate the inspiration flow for iterative auditing.
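
The scene graph described in the Figure 1 caption can be pictured as a small data structure. The sketch below is illustrative only and is not Vipera's implementation: the `SceneGraph` class, its method names, the `has_attribute` relation, and the sample labels are all hypothetical. It assumes that the bar chart for an attribute node plots how often each label value occurs across the image collection.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class SceneGraph:
    """Hypothetical node-link structure: objects, attributes, relations."""
    # node id -> kind, either "object" or "attribute" (assumed schema)
    nodes: dict = field(default_factory=dict)
    # (source, relation, target) triples for semantic relationships
    edges: list = field(default_factory=list)
    # attribute node id -> one label per audited image
    labels: dict = field(default_factory=dict)

    def add_object(self, name):
        self.nodes[name] = "object"
        return name

    def add_attribute(self, obj, attr):
        if self.nodes.get(obj) != "object":
            raise KeyError(f"unknown object node: {obj}")
        node_id = f"{obj}.{attr}"
        self.nodes[node_id] = "attribute"
        self.edges.append((obj, "has_attribute", node_id))
        self.labels[node_id] = []
        return node_id

    def add_relation(self, source, relation, target):
        self.edges.append((source, relation, target))

    def label(self, attr_id, value):
        # Record the label assigned to one image for this attribute.
        self.labels[attr_id].append(value)

    def distribution(self, attr_id):
        # Counts per label value: the data a bar chart would plot.
        return dict(Counter(self.labels[attr_id]))


# Placeholder example, not data from the paper.
graph = SceneGraph()
graph.add_object("doctor")
graph.add_object("hospital")
graph.add_relation("doctor", "in", "hospital")
gender = graph.add_attribute("doctor", "gender")
for value in ["male", "male", "female", "male"]:
    graph.label(gender, value)

print(graph.distribution(gender))  # {'male': 3, 'female': 1}
```

Keeping the per-image labels on the attribute node, rather than only the aggregated counts, would let an auditor trace a bar back to the individual images behind it; whether Vipera stores the data this way is not stated in the source.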