From Literature to Practice: Exploring Fairness Testing Tools for the Software Industry Adoption

Thanh Nguyen, Luiz Fernando de Lima, Maria Teresa Baldassarre, Ronnie de Souza Santos

TL;DR

The paper investigates the practicality of fairness testing tools for software development by combining document analysis and heuristic evaluation of 41 tools from the literature, narrowing to five that meet pragmatic criteria. It reveals major usability gaps: minimal documentation, lack of user interfaces, limited dataset versatility (mostly binary), and infrequent updates, hindering industry adoption. Scikit-fairness emerges as the most viable option among the five, though improvements in usability and documentation remain needed. The authors argue for industry-oriented fairness tooling that integrates into development workflows and supports early bias mitigation with robust reporting. The work highlights a gap between research proposals and deployable, maintainable tools in real-world software engineering.

Abstract

In today's world, we need to ensure that AI systems are fair and unbiased. Our study looked at tools designed to test the fairness of software to see if they are practical and easy for software developers to use. We found that while some tools are cost-effective and compatible with various programming environments, many are hard to use and lack detailed instructions. They also tend to focus on specific types of data, which limits their usefulness in real-world situations. Overall, current fairness testing tools need significant improvements to better support software developers in creating fair and equitable technology. We suggest that new tools should be user-friendly, well-documented, and flexible enough to handle different kinds of data, helping developers identify and fix biases early in the development process. This will lead to more trustworthy and fair software for everyone.

Paper Structure (13 sections, 2 tables)