Assessing Human Rights Risks in AI: A Framework for Model Evaluation
Vyoma Raman, Camille Chabot, Betsy Popken
TL;DR
The paper presents a rights-based, three-part framework for evaluating human rights risks in generative AI, anchored in the UN Guiding Principles on Business and Human Rights. It operationalizes task selection, metric design, and rights analysis to assess how model outputs impact the rights to information access and freedom of thought in deployment contexts. A case study on large language models in political news journalism demonstrates how to design context-specific tasks, metrics, and benchmarking to reveal rights-related harms. The findings show model performance varies across correction, framing, and identity representation, underscoring the need for context-aware, multi-metric auditing to guide governance and responsible deployment in sensitive domains.
Abstract
The Universal Declaration of Human Rights and other international agreements outline numerous inalienable rights that apply across geopolitical boundaries. As generative AI becomes increasingly prevalent, it poses risks to human rights such as non-discrimination, health, and security, which are also central concerns for AI researchers focused on fairness and safety. We contribute to the field of algorithmic auditing by presenting a framework to computationally assess human rights risk. Drawing on the UN Guiding Principles on Business and Human Rights, we develop an approach to evaluating a model that supports grounded claims about the level of risk it poses to particular human rights. Our framework consists of three parts: selecting tasks that are likely to pose human rights risks within a given context; designing metrics to measure the scope, scale, and likelihood of potential risks from those tasks; and analyzing rights with respect to the values of those metrics. Because a human rights approach centers on real-world harms, it requires evaluating AI systems in the specific contexts in which they are deployed. We present a case study of large language models in political news journalism, demonstrating how our framework helps to design an evaluation and benchmark different models. We then discuss the implications of the results for the rights of access to information and freedom of thought, as well as broader considerations for adopting this approach.
