AI data transparency: an exploration through the lens of AI incidents

Sophia Worth; Ben Snaith; Arunav Das; Gefion Thuermer; Elena Simperl

TL;DR

We demonstrate that low data transparency persists across a wide range of AI systems, and that transparency and explainability issues at the model and system level create barriers to investigating data transparency information to address public concerns about AI systems.

Abstract

Knowing more about the data used to build AI systems is critical for allowing different stakeholders to play their part in ensuring responsible and appropriate deployment and use. Meanwhile, a 2023 report shows that data transparency lags significantly behind other areas of AI transparency in popular foundation models. In this research, we sought to build on these findings by exploring the status of public documentation about data practices within AI systems that generate public concern. Our findings demonstrate that low data transparency persists across a wide range of systems, and further that issues of transparency and explainability at the model and system level create barriers to investigating data transparency information to address public concerns about AI systems. We highlight a need to develop systematic ways of monitoring AI data transparency that account for the diversity of AI system types, and for such efforts to build on further understanding of the needs of those both supplying and using data transparency information.

Paper Structure (26 sections, 4 figures, 2 tables)

Introduction
Background
Multiple forms of AI data transparency
Who needs AI data transparency information?
Where are we now?
Methodology
Research question
Data: AI incidents for identifying AI systems causing public concern
Methods
Filtering sample of AI systems
Analysing AI data transparency
Applying the search protocol
Findings
Identifying models and model transparency information
Assessing data transparency within identifiable models
...and 11 more sections

Figures (4)

Figure 1: Figure from Bommasani et al. (2023), from their Foundation Model Transparency Index paper, comparing transparency across 10 key foundation models and 10 aspects of AI ecosystem transparency. Their ‘data layer’ includes data, labour and compute factors.
Figure 2: Overview of methodology
Figure 3: AI models scoring a point for each data indicator in this research (n=25) in comparison to the findings of the Foundation Model Transparency Index (n=10) (Bommasani et al., 2023)
Figure 4: Comparing number of data transparency indicators across all AI models analysed
