How fair are we? From conceptualization to automated assessment of fairness definitions

Giordano d'Aloisio; Claudio Di Sipio; Antinisca Di Marco; Davide Di Ruscio

TL;DR

The paper tackles the problem of rigid, predefined fairness definitions in ML-enabled software by introducing MODNESS, a model-driven framework with a two-layer metamodel and a domain-specific language that lets users conceptualize and implement custom fairness concepts. It automates the entire fairness assessment workflow, from bias definition to code generation for fairness tests, enabling domain-agnostic analysis across university admissions, recommender systems, and IoT datasets. Through a lightweight literature review and a comparative evaluation against MDE baselines, MODNESS demonstrates greater expressiveness (supporting both group and individual biases, domain bias definitions, and custom metrics) while maintaining automation. The work enables tailored fairness evaluations across diverse domains, with an extensible generation pipeline and open replication assets for broader adoption and future work.

Abstract

Fairness is a critical concept in ethics and social domains, but it is also a challenging property to engineer in software systems. With the increasing use of machine learning in software systems, researchers have been developing techniques to automatically assess the fairness of software systems. Nonetheless, a significant proportion of these techniques rely on pre-established fairness definitions, metrics, and criteria, which may fail to encompass the wide-ranging needs and preferences of users and stakeholders. To overcome this limitation, we propose a novel approach, called MODNESS, that enables users to define and customize their own fairness concepts using a dedicated modeling environment. Our approach guides the user through the definition of new fairness concepts, including in emerging domains, and through the specification and composition of metrics for their evaluation. Ultimately, MODNESS generates the source code to implement fairness assessment based on these custom definitions. In addition, we describe the process we followed to collect and analyze relevant literature on fairness assessment in software engineering (SE). We compare MODNESS with the selected approaches and evaluate how they support the distinguishing features identified by our study. Our findings reveal that i) most of the current approaches do not support user-defined fairness concepts; ii) our approach can cover two additional application domains not addressed by currently available tools, i.e., mitigating bias in recommender systems for software engineering and Arduino software component recommendations; iii) MODNESS demonstrates the capability to overcome the limitations of the only two other Model-Driven Engineering-based approaches for fairness assessment.
