FairPIVARA: Reducing and Assessing Biases in CLIP-Based Multimodal Models

Diego A. B. Moreira, Alef Iury Ferreira, Jhessica Silva, Gabriel Oliveira dos Santos, Luiz Pereira, João Medrado Gondim, Gustavo Bonil, Helena Maia, Nádia da Silva, Simone Tiemi Hashiguti, Jefersson A. dos Santos, Helio Pedrini, Sandra Avila

TL;DR

This paper tackles ethical biases in CLIP-based multimodal models, particularly when these models are adapted to low-resource languages such as Portuguese through CAPIVARA. It introduces FairPIVARA, a post-processing method that prunes embedding dimensions, guided by a bias score $d$ computed from cosine similarities to Good/Bad concept embeddings and constrained by mutual information $MI(\hat{X}; Y) > \theta$ to preserve utility. Using a bias dataset derived from MMBias and a Portuguese extension, FairPIVARA reduces discriminatory associations by up to $98\%$, with relative-bias improvements of roughly $93\%$–$98\%$ and only minor drops in downstream classification performance (typically under $1.5$ pp). The approach shows that substantial fairness gains can be achieved without retraining, which has practical implications for deploying multilingual vision-language systems and guides future work on scaling the method and applying it to other architectures.
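The pruning idea summarized above can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not give the exact form of $d$, so the code assumes a WEAT-style effect size (difference in mean Good-minus-Bad cosine association between two concept sets, divided by the pooled standard deviation). The function names, the greedy search, and the `utility` callable are hypothetical; `utility` stands in for an estimate of $MI(\hat{X}; Y)$ on the kept dimensions and must be supplied by the caller.

```python
import numpy as np


def _unit(M):
    """L2-normalize each row so dot products become cosine similarities."""
    return M / np.linalg.norm(M, axis=1, keepdims=True)


def association(T, good, bad):
    """Per-row mean cosine similarity to Good minus mean similarity to Bad."""
    T, good, bad = _unit(T), _unit(good), _unit(bad)
    return (T @ good.T).mean(axis=1) - (T @ bad.T).mean(axis=1)


def bias_score(X, Y, good, bad):
    """Assumed WEAT-style effect size d between concept sets X and Y."""
    sx, sy = association(X, good, bad), association(Y, good, bad)
    pooled_std = np.concatenate([sx, sy]).std(ddof=1)
    return (sx.mean() - sy.mean()) / pooled_std


def greedy_prune(X, Y, good, bad, n_remove, utility, theta):
    """Greedily drop up to n_remove dimensions that most reduce |d|.

    utility(mask) is a hypothetical stand-in for an MI estimate on the kept
    dimensions; a candidate removal is allowed only if utility(mask) > theta.
    Returns a boolean mask of the dimensions to keep.
    """
    def abs_d(mask):
        return abs(bias_score(X[:, mask], Y[:, mask], good[:, mask], bad[:, mask]))

    keep = np.ones(X.shape[1], dtype=bool)
    for _ in range(n_remove):
        best_dim, best_abs = None, abs_d(keep)
        for j in np.flatnonzero(keep):
            trial = keep.copy()
            trial[j] = False
            if utility(trial) <= theta:
                continue  # removal would violate the utility constraint
            score = abs_d(trial)
            if score < best_abs:
                best_dim, best_abs = j, score
        if best_dim is None:
            break  # no admissible removal lowers |d| any further
        keep[best_dim] = False
    return keep
```

In use, `X` and `Y` would hold image or text embeddings for two concept groups, `good` and `bad` would hold embeddings of the Good/Bad attribute terms, and the returned mask would be applied to every embedding at inference time, which is what makes the approach a post-processing step that needs no retraining.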

Abstract

Despite significant advancements and pervasive use of vision-language models, a paucity of studies has addressed their ethical implications. These models typically require extensive training data, often from hastily reviewed text and image datasets, leading to highly imbalanced datasets and ethical concerns. Additionally, models initially trained in English, such as CLIP, are frequently fine-tuned for other languages; the added data can enhance their capabilities but can also introduce new biases. CAPIVARA, a CLIP-based model adapted to Portuguese, has shown strong performance in zero-shot tasks. In this paper, we evaluate four different types of discriminatory practices within vision-language models and introduce FairPIVARA, a method to reduce them by removing the most affected dimensions of feature embeddings. Applying FairPIVARA reduces observed biases by up to 98% while promoting a more balanced word distribution within the model. Our model and code are available at: https://github.com/hiaac-nlp/FairPIVARA.

Paper Structure (19 sections, 3 equations, 8 figures, 9 tables)


Figures (8)

  • Figure 1: FairPIVARA integration into traditional vision-language models.
  • Figure 2: Portion of the MMBias dataset and addition of data used for FairPIVARA.
  • Figure 3: Comparative flow of good and bad visual and textual descriptions of concepts, using CAPIVARA as a feature extractor.
  • Figure A1: Relationship between dimension removal and distortion reduction with equivalent removal.
  • Figure A2: Relationship between Theta value and Bias Mitigation with equivalent removal.
  • ...and 3 more figures