Table of Contents
Fetching ...

Memory Undone: Between Knowing and Not Knowing in Data Systems

Viktoriia Makovska, George Fletcher, Julia Stoyanovich, Tetiana Zakharchenko

TL;DR

This work argues that "forgetting" in computational systems cannot be reduced to a single operation and should instead be treated as a sociotechnical practice with distinct mechanisms and consequences, and proposes reframing unlearning as a first-class capability in knowledge infrastructures.

Abstract

Machine learning and data systems increasingly function as infrastructures of memory: they ingest, store, and operationalize traces of personal, political, and cultural life. Yet contemporary governance demands credible forms of forgetting, from GDPR-backed deletion to harm-mitigation and the removal of manipulative content, while technical infrastructures are optimized to retain, replicate, and reuse. This work argues that "forgetting" in computational systems cannot be reduced to a single operation (e.g., record deletion) and should instead be treated as a sociotechnical practice with distinct mechanisms and consequences. We clarify a vocabulary that separates erasure (removing or disabling access to data artifacts), unlearning (interventions that bound or remove a data point influence on learned parameters and outputs), exclusion (upstream non-collection and omission), and forgetting as an umbrella term spanning agency, temporality, reversibility, and scale. Building on examples from machine unlearning, semantic dependencies in data management, participatory data modeling, and manipulation at scale, we show how forgetting can simultaneously protect rights and enable silencing. We propose reframing unlearning as a first-class capability in knowledge infrastructures, evaluated not only by compliance or utility retention, but by its governance properties: transparency, accountability, and epistemic justice.

Memory Undone: Between Knowing and Not Knowing in Data Systems

TL;DR

This work argues that "forgetting" in computational systems cannot be reduced to a single operation and should instead be treated as a sociotechnical practice with distinct mechanisms and consequences, and proposes reframing unlearning as a first-class capability in knowledge infrastructures.

Abstract

Machine learning and data systems increasingly function as infrastructures of memory: they ingest, store, and operationalize traces of personal, political, and cultural life. Yet contemporary governance demands credible forms of forgetting, from GDPR-backed deletion to harm-mitigation and the removal of manipulative content, while technical infrastructures are optimized to retain, replicate, and reuse. This work argues that "forgetting" in computational systems cannot be reduced to a single operation (e.g., record deletion) and should instead be treated as a sociotechnical practice with distinct mechanisms and consequences. We clarify a vocabulary that separates erasure (removing or disabling access to data artifacts), unlearning (interventions that bound or remove a data point influence on learned parameters and outputs), exclusion (upstream non-collection and omission), and forgetting as an umbrella term spanning agency, temporality, reversibility, and scale. Building on examples from machine unlearning, semantic dependencies in data management, participatory data modeling, and manipulation at scale, we show how forgetting can simultaneously protect rights and enable silencing. We propose reframing unlearning as a first-class capability in knowledge infrastructures, evaluated not only by compliance or utility retention, but by its governance properties: transparency, accountability, and epistemic justice.
Paper Structure (3 sections)