Representation Fidelity: Auditing Algorithmic Decisions About Humans Using Self-Descriptions

Theresa Elstner; Martin Potthast

TL;DR

This paper presents the first benchmark for evaluating representation fidelity, based on a dataset of loan-granting decisions. It examines the nature of discrepancies between the input representation on which a decision is based and the self-description of the person concerned, shows how such discrepancies can be quantified, and derives a generic typology of representation mismatches that determine the degree of representation fidelity.

Abstract

This paper introduces a new dimension for validating algorithmic decisions about humans by measuring the fidelity of their representations. Representation fidelity measures whether decisions about a person rest on reasonable grounds. We propose to operationalize this notion by measuring the distance between two representations of the same person: (1) an externally prescribed input representation on which the decision is based, and (2) a self-description provided by the human subject of the decision, used solely to validate the input representation. We examine the nature of discrepancies between these representations, show how such discrepancies can be quantified, and derive a generic typology of representation mismatches that determine the degree of representation fidelity. We further present the first benchmark for evaluating representation fidelity based on a dataset of loan-granting decisions. Our Loan-Granting Self-Representations Corpus 2025 consists of 30 000 synthetic natural language self-descriptions derived from corresponding representations of applicants in the German Credit Dataset, along with expert annotations of representation mismatches between each pair of representations.

Paper Structure (38 sections, 2 equations, 3 figures, 4 tables)

Introduction
Related Work
Constructs and Observations
Human-targeted Decision Systems
Representations
Representational Alignment
Self-Representations
Representation Learning
Mitigating Representational Harms
Transparency, Mutability, Fairness
The German Credit Dataset
A Qualitative Analysis Approach to Representation Fidelity
Typology of Representation Mismatches
Relevance for Representation Fidelity.
Requirements for Self-Descriptions
...and 23 more sections

Figures (3)

Figure 1: Overview of our approach to analyze representation fidelity. Top row: An algorithmic decision system typically implements a standard classification pipeline. An individual $v$ is represented as $\mathbf{x}$ using a representation function $\alpha_1()$. The representation $\mathbf{x}$ is then fed to the classifier $c()$ of an algorithmic decision system, which outputs the decision $c(\mathbf{x})$. Bottom row: Our approach to analyze representation fidelity represents $v$ as a natural language self-description $\mathbf{d}$ using a representation function $\alpha_2()$. The representation $\mathbf{d}$ is then compared to $\mathbf{x}$ manually and/or using a distance measure $\delta()$ to obtain a qualitative and/or a quantitative representation mismatch for the decision $c(\mathbf{x})$ about $v$.
Figure 2: This example illustrates our annotation method for representation mismatch analysis by comparing an input representation $\mathbf{x}$ (left) to a generated natural-language self-description $\mathbf{d}$ (right). We first segment $\mathbf{d}$ into information-bearing units, aiming for exhaustive semantic coverage. We then assign to each unit one or more labels from four label types: (1) a label from $\mathbf{x}$'s feature set whenever a semantic correspondence exists, (2) an aspect label for any minimal information-bearing unit, (3) a new subject label for semantically coherent information units recurring across different self-descriptions, and (4) a specialization label whenever a feature in $\mathbf{x}$ is described in more detail in $\mathbf{d}$.
Figure 3: Visualization of similarities of embedded self-descriptions, as implemented in ren:2025. Self-descriptions form clusters depending on the large language model that generated them (gpt-4.1, gpt-4o, o3, llama-3.3-70b-versatile, moonshotai/kimi-k2-instruct, qwen/qwen3-32b).
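
The comparison of $\mathbf{x}$ and $\mathbf{d}$ outlined in Figure 1 can be illustrated with a minimal sketch. It assumes that a self-description has already been segmented into information-bearing units and that each unit carries either the name of a corresponding feature of $\mathbf{x}$ or no correspondence at all. The Jaccard-style distance is only an illustrative stand-in for $\delta()$, not the measure used in the paper, and all names (`Unit`, `mismatch`) as well as the example values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Unit:
    """One information-bearing unit of a self-description d (hypothetical type)."""
    text: str
    feature: Optional[str]  # name of the matching feature in x, or None


def mismatch(x: dict, units: list) -> dict:
    """Compare an input representation x with a segmented self-description.

    Returns the features of x that no unit mentions, the units without a
    counterpart in x, and a Jaccard-style distance between both sides.
    This is an assumed stand-in for delta(), not the paper's measure.
    """
    features_x = set(x)
    # Features of x that at least one unit of d refers to.
    covered = {u.feature for u in units if u.feature in features_x}
    # Units of d that have no counterpart among the features of x.
    extra = [u.text for u in units if u.feature not in features_x]
    missing = sorted(features_x - covered)
    # Jaccard distance: 1 - |intersection| / |union|, where the union holds
    # all features of x plus every unmatched unit of d.
    union = len(features_x) + len(extra)
    distance = (len(missing) + len(extra)) / union if union else 0.0
    return {"missing_in_d": missing, "extra_in_d": extra, "distance": distance}


# Illustrative values only; not taken from the corpus.
x = {"age": 35, "housing": "rent", "purpose": "car"}
units = [
    Unit("I am 35 years old", "age"),
    Unit("I need the loan to buy a car", "purpose"),
    Unit("I volunteer at a local shelter", None),
]
result = mismatch(x, units)
```

Here `result["missing_in_d"]` lists the features of $\mathbf{x}$ that the self-description never mentions, and `result["extra_in_d"]` lists what the person says about themselves that $\mathbf{x}$ cannot express. A count-based distance like this ignores whether matched values agree, so it captures only coverage, not contradiction.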
