A Conceptual Framework for Ethical Evaluation of Machine Learning Systems

Neha R. Gupta; Jessica Hullman; Hari Subramonyam

TL;DR

The paper addresses the problem that ethical harms can arise during ML evaluation, not only after deployment. It proposes a utility-based approach in which each candidate evaluation action $a$ is scored by $u(a)=\mathbb{E}[\mathrm{IG}(a)]-\mathbb{E}[\mathrm{EH}(a)]-\mathbb{E}[\mathrm{cost}]$, where the expected ethical harm is a weighted sum over harm types, $\mathbb{E}[\mathrm{EH}(a)]=\sum_j w_j\,\mathbb{E}[\mathrm{EH}_j(a)]$, and the preferred action is $a^*=\arg\max_{a\in A} u(a)$. Information gain, ethical harms, and costs are all treated as forecasted expectations under uncertainty. The framework emphasizes balancing information value against ethical risk and resource constraints, and it draws on analogies to clinical trials and automotive testing to propose governance structures such as external review boards and regulatory risk assessments. The work aims to catalyze policy-oriented discussion and future research on designing ethical, auditable ML evaluations that guide responsible development and deployment.
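The utility ranking described above can be illustrated with a small sketch. It is not the authors' implementation: the class and function names, the two harm types, their weights, and all numeric forecasts are hypothetical values chosen only to show how the terms combine, and the cost is assumed here to vary per evaluation option.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class EvaluationOption:
    """One candidate evaluation action a, with forecasted (expected) quantities."""
    name: str
    expected_info_gain: float          # E[IG(a)]
    expected_harms: Dict[str, float]   # E[EH_j(a)] for each harm type j
    expected_cost: float               # E[cost], assumed per-option in this sketch


def expected_utility(option: EvaluationOption, harm_weights: Dict[str, float]) -> float:
    """u(a) = E[IG(a)] - sum_j w_j * E[EH_j(a)] - E[cost]."""
    weighted_harm = sum(
        harm_weights[harm] * value for harm, value in option.expected_harms.items()
    )
    return option.expected_info_gain - weighted_harm - option.expected_cost


def best_option(options: List[EvaluationOption], harm_weights: Dict[str, float]) -> EvaluationOption:
    """a* = argmax over the candidate set A of u(a)."""
    return max(options, key=lambda o: expected_utility(o, harm_weights))


# Hypothetical weights and forecasts, for illustration only.
HARM_WEIGHTS = {"fairness": 1.0, "privacy": 2.0}

OPTIONS = [
    EvaluationOption(
        name="offline_benchmark",
        expected_info_gain=0.4,
        expected_harms={"fairness": 0.05, "privacy": 0.0},
        expected_cost=0.1,
    ),
    EvaluationOption(
        name="live_ab_test",
        expected_info_gain=0.9,
        expected_harms={"fairness": 0.2, "privacy": 0.15},
        expected_cost=0.3,
    ),
]

chosen = best_option(OPTIONS, HARM_WEIGHTS)
print(chosen.name)
```

With these made-up numbers the more informative live test loses to the offline benchmark because its weighted harms and cost outweigh its extra information gain. Changing the weights $w_j$ can reverse the ranking, which is why the choice of weights is itself an ethical judgment rather than a purely technical one.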

Abstract

Research in Responsible AI has developed a range of principles and practices to ensure that machine learning systems are used in a manner that is ethical and aligned with human values. However, a critical yet often neglected aspect of ethical ML is the set of ethical implications that arise when designing evaluations of ML systems. For instance, teams may have to weigh highly informative tests that help ensure downstream product safety against potential fairness harms inherent to the testing procedures themselves. We conceptualize ethics-related concerns in standard ML evaluation techniques. Specifically, we present a utility framework that characterizes the key trade-off in ethical evaluation as balancing information gain against potential ethical harms. The framework thus serves as a tool for characterizing the challenges teams face and for systematically disentangling the competing considerations they seek to balance. Distinguishing between the types of issues encountered in evaluation allows us to highlight best practices from analogous domains, such as clinical trials and automotive crash testing, which navigate these issues in ways that can inspire improvements to evaluation processes in ML. Our analysis underscores the critical need for development teams to deliberately assess and manage the ethical complexities that arise during the evaluation of ML systems, and for the industry to move towards designing institutional policies that support ethical evaluations.

Paper Structure (14 sections, 6 equations, 2 tables)

Introduction
Related Works
Ethical AI
ML System Evaluation Practices
Ethical Evaluation Model
Motivation
Model Properties
Discussion
Recommendations from the model
Alternative conceptualizations
Conclusion
Acknowledgements
Appendix
Challenges in Balancing Evaluation Considerations
