Bita: A Conversational Assistant for Fairness Testing

Keeryn Johnson, Cleyton Magalhaes, Ronnie de Souza Santos

TL;DR

Fairness testing is essential, but existing tools are hard to adopt in practice. The authors present Bita, a conversational assistant that couples a large language model with retrieval-augmented generation to ground fairness guidance in curated literature and to support bias identification, test-plan evaluation, and exploratory-charter generation. In a validation on two real-world AI systems, Bita provides context-sensitive, reproducible, and workflow-aligned assistance for fairness testing. The work offers a practical, scalable approach to operationalizing fairness testing in industry settings.

Abstract

Bias in AI systems can lead to unfair and discriminatory outcomes, especially when left untested before deployment. Although fairness testing aims to identify and mitigate such bias, existing tools are often difficult to use, requiring advanced expertise and offering limited support for real-world workflows. To address this, we introduce Bita, a conversational assistant designed to help software testers detect potential sources of bias, evaluate test plans through a fairness lens, and generate fairness-oriented exploratory testing charters. Bita integrates a large language model with retrieval-augmented generation, grounding its responses in curated fairness literature. Our validation demonstrates how Bita supports fairness testing tasks on real-world AI systems, providing structured, reproducible evidence of its utility. In summary, our work contributes a practical tool that operationalizes fairness testing in a way that is accessible, systematic, and directly applicable to industrial practice.

Paper Structure

This paper contains 18 sections, 1 figure, 1 table.

Figures (1)

  • Figure 1: Bita’s Interactive Interface in Use