Vipera: Towards systematic auditing of generative text-to-image models at scale
Yanwei Huang, Wesley Hanwen Deng, Sijia Xiao, Motahhare Eslami, Jason I. Hong, Adam Perer
TL;DR
Vipera tackles the scalability challenge of auditing generative text-to-image outputs by combining a scene-graph visual cue with LLM-powered suggestions to guide structured, multi-faceted analysis. The system supports intuitive sensemaking through prompts, image collections, and labeling, augmented by distribution visualizations and AI-driven prompts. An observational study with five auditors shows Vipera improves breadth and organization of auditing while revealing a need for customization and improved guidance for depth. Overall, Vipera offers a practical pathway toward scalable, human-in-the-loop auditing that promotes responsible and transparent GenAI use in creative applications.
Abstract
Generative text-to-image (T2I) models are known for risks such as bias, offense, and misinformation. Current AI auditing methods face challenges in scalability and thoroughness, and it is even more challenging to enable auditors to explore the auditing space in a structured and effective way. Vipera employs multiple visual cues, including a scene graph, to facilitate sensemaking over image collections and to inspire auditors to explore and hierarchically organize auditing criteria. Additionally, it leverages LLM-powered suggestions to surface auditing directions that have not yet been explored. An observational user study demonstrates Vipera's effectiveness in helping auditors organize their analyses while engaging with diverse criteria.
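
To make the idea of hierarchically organized auditing criteria concrete, the following is a minimal sketch, not Vipera's actual implementation. It assumes a scene graph can be approximated as a tree of criteria, where each node stores auditor-assigned labels per image and can report the label distribution. All names here (`CriterionNode`, `add_child`, `distribution`, `paths`, `build_example`) and the example labels are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field


@dataclass
class CriterionNode:
    """One auditing criterion in a tree (hypothetical stand-in for a scene-graph node)."""

    name: str
    children: list[CriterionNode] = field(default_factory=list)
    # Maps an image id to the label an auditor assigned for this criterion.
    labels: dict[str, str] = field(default_factory=dict)

    def add_child(self, name: str) -> CriterionNode:
        """Attach a more specific criterion beneath this one and return it."""
        child = CriterionNode(name)
        self.children.append(child)
        return child

    def distribution(self) -> dict[str, float]:
        """Return the share of labeled images per label (empty if nothing is labeled)."""
        counts = Counter(self.labels.values())
        total = sum(counts.values())
        if total == 0:
            return {}
        return {label: n / total for label, n in counts.items()}

    def paths(self, prefix: tuple[str, ...] = ()):
        """Yield the path from the root to every node, depth first."""
        here = prefix + (self.name,)
        yield here
        for child in self.children:
            yield from child.paths(here)


def build_example() -> tuple[CriterionNode, CriterionNode]:
    """Build a tiny illustrative tree with made-up labels for four images."""
    root = CriterionNode("scene")
    person = root.add_child("person")
    gender = person.add_child("perceived gender")
    gender.labels = {"img1": "man", "img2": "man", "img3": "woman", "img4": "man"}
    return root, gender


if __name__ == "__main__":
    root, gender = build_example()
    for path in root.paths():
        print(" > ".join(path))
    print(gender.distribution())
```

In a sketch like this, a skewed distribution on a node is the kind of signal an auditor might inspect further, and leaf nodes without labels mark criteria that have not been examined yet.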
