Enhancing Model Fairness and Accuracy with Similarity Networks: A Methodological Approach

Samira Maghool, Paolo Ceravolo

TL;DR

An approach for thoroughly exploring the dataset features that introduce bias in downstream machine-learning tasks. It also performs well in imputing and augmenting datasets while addressing fairness concerns such as demographic parity and class imbalance.

Abstract

In this paper, we propose an approach for thoroughly exploring the dataset features that introduce bias in downstream machine-learning tasks. Depending on the data format, we use different techniques to map instances into a similarity feature space. The method's ability to adjust the resolution of pairwise similarity gives clear insight into the relationship between a dataset's classification complexity and model fairness. Experimental results support the applicability of the similarity network in promoting fair models. Moreover, the methodology not only appears promising for fair downstream tasks such as classification, but also performs well in imputing and augmenting datasets while addressing fairness concerns such as demographic parity and class imbalance.
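
To make the idea of an adjustable similarity resolution concrete, the following is a minimal sketch rather than the authors' implementation. It assumes numeric features, cosine similarity as the pairwise measure, and a hypothetical `threshold` parameter that decides which links are kept; the function name `similarity_network` and the toy data are illustrative only.

```python
import numpy as np

def similarity_network(X, threshold):
    """Build a weighted adjacency matrix from pairwise cosine similarity.

    Assumption: cosine similarity stands in for whatever similarity
    measure suits the data format; `threshold` is a hypothetical knob
    that controls how many links survive.
    """
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0           # avoid division by zero for all-zero rows
    unit = X / norms
    S = unit @ unit.T                 # N x N pairwise cosine similarities
    np.fill_diagonal(S, 0.0)          # no self-links
    return np.where(S >= threshold, S, 0.0)

# Toy data: instances 0 and 1 are close, instance 2 is far from both.
X = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
A_coarse = similarity_network(X, threshold=0.0)   # keeps weak links too
A_fine = similarity_network(X, threshold=0.9)     # keeps only strong links
```

Raising the threshold prunes weak links, so the same dataset can be inspected as a dense or a sparse network.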

Paper Structure

This paper contains 11 sections, 3 equations, 4 figures, 1 table.

Figures (4)

  • Figure 1: (I) Graph construction from datasets based on the similarities between instances. (II) An $N\times N$ adjacency matrix holds the link weights of the similarity network. (III) The final graph represents dataset entries as a weighted network $\mathcal{N} = (\mathbf{V}, \mathbf{E})$, where $\mathbf{V}$ corresponds to the network nodes (vertices) and $\mathbf{E}$ to the links (edges) among them.
  • Figure 2: The process of imputing a missing value (NULL) using the features of similar nodes.
  • Figure 3: Schematic view of data augmentation, proposed as an application of the constructed similarity network. Once the threshold for keeping links has been chosen, the newly added node, y, is linked to nodes i and k, with which it has the largest link weights. $VL_{y}$ is obtained from a Vector-Label Propagation algorithm while keeping an eye on the fairness metrics to be evaluated.
  • Figure 4: SHAP values of the most important features, extracted from an RF classifier predicting the pay rate.
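
The imputation step described in the Figure 2 caption can be sketched as follows. This is an illustrative assumption, not the paper's procedure: missing entries are filled with the similarity-weighted mean of the `k` most strongly linked nodes that have the value. The function name `impute_from_neighbours`, the parameter `k`, and the toy adjacency weights are all hypothetical.

```python
import numpy as np

def impute_from_neighbours(X, A, k=2):
    """Fill NaN entries using the k most similar nodes that have the value.

    X : (N, d) feature matrix with NaN marking missing values.
    A : (N, N) weighted adjacency matrix of the similarity network.
    Assumption: a similarity-weighted mean is used as the fill value.
    """
    X = np.array(X, dtype=float)
    filled = X.copy()
    for i, j in zip(*np.where(np.isnan(X))):
        weights = A[i].copy()
        weights[np.isnan(X[:, j])] = 0.0       # nodes lacking the value cannot contribute
        top = np.argsort(weights)[::-1][:k]    # k strongest links of node i
        top = top[weights[top] > 0]            # drop unlinked nodes
        if top.size:
            filled[i, j] = np.average(X[top, j], weights=weights[top])
    return filled

# Toy example: node 1 is missing its second feature.
X = np.array([[1.0, 10.0], [1.1, np.nan], [5.0, 50.0], [0.9, 12.0]])
A = np.array([[0.0, 0.9, 0.1, 0.8],
              [0.9, 0.0, 0.1, 0.7],
              [0.1, 0.1, 0.0, 0.1],
              [0.8, 0.7, 0.1, 0.0]])
filled = impute_from_neighbours(X, A, k=2)
```

Here node 1 borrows from nodes 0 and 3, its two strongest links, and ignores the dissimilar node 2. An entry with no linked node holding the value is left as NaN.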