Bita: A Conversational Assistant for Fairness Testing
Keeryn Johnson, Cleyton Magalhaes, Ronnie de Souza Santos
TL;DR
Fairness testing is essential, but existing tools are hard to adopt in practice. The authors present Bita, a conversational assistant that couples an LLM with retrieval-augmented generation to ground fairness guidance in curated literature and to support bias identification, test-plan evaluation, and exploratory-charter generation. In a validation on two real-world AI systems, Bita provides context-sensitive, reproducible, and workflow-aligned assistance for fairness testing. The work offers a practical, scalable approach to operationalizing fairness testing in industry settings.
Abstract
Bias in AI systems can lead to unfair and discriminatory outcomes, especially when it goes untested before deployment. Although fairness testing aims to identify and mitigate such bias, existing tools are often difficult to use, requiring advanced expertise and offering limited support for real-world workflows. To address this, we introduce Bita, a conversational assistant designed to help software testers detect potential sources of bias, evaluate test plans through a fairness lens, and generate fairness-oriented exploratory testing charters. Bita integrates a large language model with retrieval-augmented generation, grounding its responses in curated fairness literature. Our validation demonstrates how Bita supports fairness testing tasks on real-world AI systems, providing structured, reproducible evidence of its utility. In summary, our work contributes a practical tool that operationalizes fairness testing in a way that is accessible, systematic, and directly applicable to industrial practice.
