Evaluation of AI Ethics Tools in Language Models: A Developers' Perspective Case Study

Jhessica Silva, Diego A. B. Moreira, Gabriel O. dos Santos, Alef Ferreira, Helena Maia, Sandra Avila, Helio Pedrini

TL;DR

The paper addresses the challenge of evaluating AI Ethics Tools (AIETs) for language-model deployments by combining a comprehensive literature survey with an empirical, developer-facing case study in Portuguese. It systematically selects four AIETs (Model Cards, ALTAI, FactSheets, Harms Modeling) and applies them to PT-language LMs via interviews with 11 developers, plus a CAPIVARA pilot. Findings show the tools broadly guide ethical considerations but fail to capture language-specific harms like idiomatic expressions and cultural representation; Harms Modeling and Model Cards perform best in identifying risks and producing usable documentation. The study highlights the need for multiple, complementary AIETs and earlier ethical assessments in the AI lifecycle, alongside broader, multidisciplinary evaluation and standardization efforts.

Abstract

In Artificial Intelligence (AI), language models have gained significant importance due to the widespread adoption of systems capable of simulating realistic conversations with humans through text generation. Because of their impact on society, developing and deploying these language models must be done responsibly, with attention to their negative impacts and possible harms. In this scenario, the number of AI Ethics Tools (AIETs) publications has recently increased. These AIETs are designed to help developers, companies, governments, and other stakeholders establish trust, transparency, and responsibility with their technologies by bringing accepted values to guide AI's design, development, and use stages. However, many AIETs lack good documentation, examples of use, and proof of their effectiveness in practice. This paper presents a methodology for evaluating AIETs in language models. Our approach involved an extensive literature survey on 213 AIETs, and after applying inclusion and exclusion criteria, we selected four AIETs: Model Cards, ALTAI, FactSheets, and Harms Modeling. For evaluation, we applied AIETs to language models developed for the Portuguese language, conducting 35 hours of interviews with their developers. The evaluation considered the developers' perspective on the AIETs' use and quality in helping to identify ethical considerations about their model. The results suggest that the applied AIETs serve as a guide for formulating general ethical considerations about language models. However, we note that they do not address unique aspects of these models, such as idiomatic expressions. Additionally, these AIETs did not help to identify potential negative impacts of models for the Portuguese language.

Paper Structure

This paper contains 26 sections, 7 figures, 10 tables.

Figures (7)

  • Figure 1: AIETs can be applied throughout the AI life-cycle, from problem specification to the use stage. Colored hexagons indicate examples of AIET types that can be applied in each stage. Categories of AIETs are shown in purple with a full border, and objectives in blue with a dotted border. The AIETs analyzed in this study are exclusively applied in the use stage of the AI life-cycle, indicated by the solid red border.
  • Figure 2: Methodology overview. The initial stage of this project involved conducting a bibliographic survey of AIETs and selecting the specific AIETs for evaluation (Fig. \ref{fig:m1}). Following the selection, we assembled the interview scripts from the AIETs' questions and carried out an initial test. Once we defined the protocol, we submitted a project to the Research Ethics Committee to obtain authorization to conduct the study, given that it involved interviews with human participants. After securing the Ethics Committee's approval, we carried out a bibliographic survey to select language models in Portuguese and invite developers to participate in the study (Fig. \ref{fig:m2}). Finally, we conducted the interviews and the AIETs were assessed by the developers (Fig. \ref{fig:m3}).
  • Figure 3: Language models developed for the Portuguese language, released from 2020 to March 2025: GPorTuguese-2 pierre2020gpt2smallportuguese, PTT5 ptt5, BERTimbau bertimbau, BioBERTpt biobertpt, BERTáu bertau, GPT2-Bio-Pt gpt-2bio, BERTikal polo2021legalnlpnaturallanguage, PetroBERT petrobert, Sabiá sabia, Albertina PT-* albertina, CardioBERTpt cardiobertpt, JurisBERT JurisBERT, Cabrita cabrita, BERTabaporu bertabaporu, DeBERTinha debertinha, CAPIVARA santos2023capivara, LegalBert-pt legalbert-pt, BODE bode, TeenyTinyLlama teenytinyllama, PeLLE demello2024pelle, GlórIA gloria, Gervásio PT* santos_advancing_2024, RoBERTaLexPT robertalexpt, Sabiá-2 almeida2024sabi, Juru junior2024juru, PTT5-v2 piau_ptt5-v2_2025, Serafim PT* gomes_open_2025, MediAlbertina nunes_medialbertina_2024, Sabiá-3 abonizio_sabia-3_2025, Tucano correa_tucano_2024, Carvalho pt-gl gamallo_galician-portuguese_2025, V-Glória simplicio_v-gloria_2024, GemBode and PhiBode garcia_gembode_2025, BERTweet.BR carneiro_bertweetbr_2025, GovBERT-BR silva_govbert-br_2025, and DeB3RTa pires_deb3rta_2025. The legend distinguishes models for Brazilian Portuguese, for European Portuguese, and for both variants.
  • Figure 4: Evaluation of the AIETs from the perspective of the developers interviewed. The graphs do not include the "neutral" option. The response options shown are "Agree", "Weakly agree", "Weakly disagree", and "Disagree".
  • Figure 5: Classification of AIETs through general questions. The AIETs shown are Model Cards, ALTAI, FactSheets, and Harms Modeling.
  • ...and 2 more figures