TIBET: Identifying and Evaluating Biases in Text-to-Image Generative Models
Aditya Chinchure, Pushkar Shukla, Gaurav Bhatt, Kiri Salij, Kartik Hosanagar, Leonid Sigal, Matthew Turk
TL;DR
TIBET introduces a dynamic, prompt-dependent framework for identifying, quantifying, and explaining biases in Text-to-Image (TTI) generation. The method uses LLMs to propose bias axes and counterfactual prompts, generates image sets for each prompt with a black-box TTI model, and measures bias with the Concept Association Score ($CAS$) and Mean Absolute Deviation ($MAD$), yielding both quantitative scores and post-hoc qualitative explanations. It supports two image-comparison strategies, a VQA-based concept extraction approach and a CLIP embedding method, enabling flexible bias analysis across prompts and axes. The paper demonstrates applicability to gender stereotypes in occupations, examines robustness to VQA errors, and shows potential for bias mitigation when combined with ITI-GEN. It also reports human studies that validate the approach and discusses limitations and ethical considerations.
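To make the role of $MAD$ concrete, the sketch below aggregates per-counterfactual $CAS$ values on one bias axis into a single number. This summary does not define $CAS$ or give the paper's exact $MAD$ formulation, so the code assumes the standard mean absolute deviation about the mean and takes the $CAS$ values as given. The function name, the dictionary keys, and the numeric scores are hypothetical and chosen only for illustration.

```python
from statistics import mean


def mean_absolute_deviation(scores):
    """Standard mean absolute deviation of the scores about their mean.

    Assumption: this is the textbook statistic, not necessarily the
    exact normalization used in the paper.
    """
    if not scores:
        raise ValueError("need at least one score")
    center = mean(scores)
    return mean(abs(s - center) for s in scores)


# Hypothetical CAS values, one per counterfactual prompt on a single bias axis.
# The keys and numbers are illustrative, not results from the paper.
cas_by_counterfactual = {
    "counterfactual_a": 0.8,
    "counterfactual_b": 0.4,
    "counterfactual_c": 0.6,
}

axis_mad = mean_absolute_deviation(list(cas_by_counterfactual.values()))
print(f"MAD over CAS values for this axis: {axis_mad:.3f}")
```

Under this reading, a value of 0 means the $CAS$ values are identical across all counterfactuals on the axis, and larger values mean they are more spread out.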
Abstract
Text-to-Image (TTI) generative models have shown great progress in the past few years in terms of their ability to generate complex and high-quality imagery. At the same time, these models have been shown to suffer from harmful biases, including exaggerated societal biases (e.g., gender, ethnicity), as well as incidental correlations that limit such a model's ability to generate more diverse imagery. In this paper, we propose a general approach to study and quantify a broad spectrum of biases, for any TTI model and for any prompt, using counterfactual reasoning. Unlike other works that evaluate generated images on a predefined set of bias axes, our approach automatically identifies potential biases that might be relevant to the given prompt, and measures those biases. In addition, we complement quantitative scores with post-hoc explanations in terms of semantic concepts in the generated images. We show that our method is uniquely capable of explaining complex multi-dimensional biases through semantic concepts, as well as the intersectionality between different biases for any given prompt. We perform extensive user studies to illustrate that the results of our method and analysis are consistent with human judgements.
