CREDAL: Close Reading of Data Models

George Fletcher, Olha Nahurna, Matvii Prytula, Julia Stoyanovich

TL;DR

The paper examines how data schemas encode social and political conditions and introduces CREDAL, a structured close-reading methodology inspired by literary analysis. It details an iterative development process (literature-grounded adaptation, supplemental materials, and qualitative validation with students and researchers) used to assess usability, usefulness, and impact on data-work practices. The results indicate that CREDAL is learnable, improves participants' understanding of and proficiency with data schemas, and is likely to be adopted in practice, while also revealing opportunities for improvement, automation, and collaboration with domain experts. The work contributes a first systematic method for critically examining data schemas and highlights the practical significance of recognizing biases and governance considerations in data systems. It lays groundwork for broader adoption and for extensions to non-relational data contexts, automation, and customized close-reading workflows.

Abstract

Data models are necessary for the birth of data and of any data-driven system. Indeed, every algorithm, every machine learning model, every statistical model, and every database has an underlying data model without which the system would not be usable. Hence, data models are excellent sites for interrogating the (material, social, political, ...) conditions giving rise to a data system. Towards this, drawing inspiration from literary criticism, we propose to closely read data models in the same spirit as we closely read literary artifacts. Close readings of data models reconnect us with, among other things, the materiality, the genealogies, the techne, the closed nature, and the design of technical systems. While recognizing from literary theory that there is no one correct way to read, it is nonetheless critical to have systematic guidance for those unfamiliar with close readings. This is especially true for those trained in the computing and data sciences, who too often are enculturated to set aside the socio-political aspects of data work. A systematic methodology for reading data models currently does not exist. To fill this gap, we present the CREDAL methodology for close readings of data models. We detail our iterative development process and present results of a qualitative evaluation of CREDAL demonstrating its usability, usefulness, and effectiveness in the critical study of data.

Paper Structure

This paper contains 58 sections, 5 figures, 1 table.

Figures (5)

  • Figure 1: Data model for Example 1. In this data model, Person is a class and Knows is a relationship between elements of this class; furthermore, instances of class Person have "name", "dob", and "race" attributes, and Knows relationships have a "since" attribute. As an example data instance of this model: Saori is a Person (name "Saori", race "Asian", and date of birth 2001) who Knows Kotaro (who is also a Person, name "Kotaro", race "Native Hawaiian or Other Pacific Islander", and date of birth 2002) since 2018.
  • Figure 2: Visual Representation of CREDAL
  • Figure 3: Word cloud of all participant responses during the interviews
  • Figure 4: Frequency distribution of interview codes
  • Figure 5: Data Model for Example Reading, based on erd
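
The data model described in the Figure 1 caption can be written down concretely. The following is a minimal sketch in Python, not the authors' notation: the class and attribute names (Person, Knows, "name", "dob", "race", "since") come from the caption, while the field names `source` and `target`, the use of dataclasses, and the choice to store "dob" and "since" as integer years are assumptions made only for illustration.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Person:
    """Class Person with the three attributes named in the Figure 1 caption."""
    name: str
    dob: int   # assumption: stored as a year, since the caption gives only 2001 and 2002
    race: str


@dataclass(frozen=True)
class Knows:
    """Relationship Knows between two Person instances."""
    source: Person  # hypothetical field name for the person who knows
    target: Person  # hypothetical field name for the person who is known
    since: int      # assumption: stored as a year, as in the caption's "since 2018"


# The example data instance given in the caption.
saori = Person(name="Saori", dob=2001, race="Asian")
kotaro = Person(
    name="Kotaro",
    dob=2002,
    race="Native Hawaiian or Other Pacific Islander",
)
saori_knows_kotaro = Knows(source=saori, target=kotaro, since=2018)

# The schema itself, separate from any instance: the attributes each
# Person is required to carry under this data model.
person_attributes = [f.name for f in fields(Person)]
print(person_attributes)
```

Listing the attributes apart from the instances makes the schema's design choices visible on their own, for example that every Person must carry a "race" value. Choices of this kind are what the abstract proposes to read closely.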