Representing data in words

Amandine M. Caut, Amy Rouillard, Beimnet Zenebe, Matthias Green, Ágúst Pálmason Morthens, David J. T. Sumpter

TL;DR

The paper addresses turning structured data into readable wordalisations using large language models, proposing a four-step prompt design (tell it who it is, tell it what it knows, tell it what data to use, tell it how to answer) paired with a normative statistical model based on $z$-scores to translate numbers into descriptive text. It argues that model cards provide a principled framework for transparency, ethics, and reproducibility, rather than relying on benchmarks. The authors demonstrate applications in football scouting, personality testing, and World Values Survey analysis, supported by a Streamlit platform that exposes prompts and outputs. Evaluation combines automated reconstruction of statistical descriptors from wordalisations with multi-run prompts to assess fidelity, showing that including data-driven descriptors improves alignment with the underlying data across domains, while also acknowledging limitations of automated evaluation and prompting choices.
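The two ingredients named above, a $z$-score model that turns numbers into words and a four-part prompt, can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the bin thresholds, the descriptive labels, and the function names `z_score`, `describe` and `build_prompt` are all assumptions made for this example.

```python
from statistics import mean, pstdev

# Hypothetical thresholds and wording (assumptions for illustration only).
# Each entry is (upper bound on z, descriptive label).
BINS = [
    (-1.5, "far below average"),
    (-0.5, "below average"),
    (0.5, "average"),
    (1.5, "above average"),
]
TOP_LABEL = "far above average"


def z_score(value, population):
    """Standardise a value against the population mean and standard deviation."""
    mu = mean(population)
    sigma = pstdev(population)
    if sigma == 0:
        raise ValueError("population has zero variance")
    return (value - mu) / sigma


def describe(z):
    """Return the label of the first bin whose upper bound exceeds z."""
    for upper, label in BINS:
        if z < upper:
            return label
    return TOP_LABEL


def build_prompt(who_it_is, what_it_knows, data_text, how_to_answer):
    """Join the four prompt parts, in order, into one text prompt."""
    return "\n\n".join([who_it_is, what_it_knows, data_text, how_to_answer])


if __name__ == "__main__":
    population = [1, 2, 3, 4, 5]  # toy data, not from the paper
    label = describe(z_score(5, population))
    print(build_prompt(
        "You are a football scout.",
        "Background knowledge about the metric goes here.",
        f"The player's passing was {label} compared with other players.",
        "Answer in three short sentences.",
    ))
```

The point of the sketch is that the language model receives the descriptive label rather than the raw number, so the translation from numbers to words is fixed by an explicit statistical model that can be documented.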

Abstract

An important part of data science is the use of visualisations to display data in a way that is easy to digest. Visualisations often rely on underlying statistical or machine learning models -- ranging from basic calculations like category means to advanced methods such as principal component analysis of multidimensional datasets -- to convey insights. We introduce an analogous concept for word descriptions of data, which we call wordalisations. Wordalisations describe data in easy-to-digest words, without necessarily reporting numerical values from the data. We show how to create wordalisations using large language models, through prompt templates engineered according to a task-agnostic structure which can be used to automatically generate prompts from data. We show how to produce reliable and engaging texts in three application areas: scouting football players, personality tests, and international survey data. Using the model cards framework, we emphasise the importance of clearly stating the model we are imposing on the data when creating the wordalisation, detailing how numerical values are translated into words, incorporating background information into prompts for the large language model, and documenting the limitations of the wordalisations. We argue that our model cards approach is a more appropriate framework for setting best practices in wordalisation of data than performance tests on benchmark datasets.

Paper Structure

This paper contains 11 sections, 1 equation, 3 figures, 3 tables.

Figures (3)

  • Figure 1: Example from the football scout application. The wordalisation appears below the data visualisation followed by a user input and the generated response.
  • Figure 2: Example from the personality test (left) and international survey (right) applications. The wordalisation appears below the visualisation followed by a user input and generated response.
  • Figure 3: Comparison of the class labels generated by the normative model with classes reconstructed from the wordalisations. For each application, football scout (left), international survey (middle) and personality test (right), multiple wordalisations were generated for each data point, so that at least $10$ valid reconstructions per data point were found, and the mean accuracy is taken over all wordalisations. We compare the accuracy of the model for two different prompts: one in which the data was given in the form of synthetic texts (blue) and one in which the data was omitted (red). The dashed line indicates the expected accuracy if the class labels were chosen at random from a uniform distribution; it lies at $\frac{1}{6}$, $\frac{1}{5}$ and $\frac{1}{2}$ for the three applications respectively.
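
The comparison in Figure 3 can be illustrated with a short Python sketch. It assumes, as the chance levels in the caption imply, that a uniform guess over $k$ classes is correct with probability $\frac{1}{k}$, and that accuracy is pooled over all reconstructions. The function names and the toy labels are hypothetical and are not taken from the paper.

```python
from fractions import Fraction


def chance_accuracy(n_classes):
    """Expected accuracy when labels are drawn uniformly from n_classes."""
    return Fraction(1, n_classes)


def mean_accuracy(true_labels, reconstructions):
    """Fraction of reconstructed labels that match the normative label.

    true_labels: one label per data point (from the normative model).
    reconstructions: for each data point, the list of labels recovered
    from its wordalisations; all reconstructions are pooled together.
    """
    hits = 0
    total = 0
    for truth, recovered in zip(true_labels, reconstructions):
        for label in recovered:
            hits += int(label == truth)
            total += 1
    if total == 0:
        raise ValueError("no reconstructions to score")
    return hits / total


if __name__ == "__main__":
    # Toy labels for illustration only, not results from the paper.
    truth = ["high", "low"]
    recon = [["high", "high"], ["low", "average"]]
    print(mean_accuracy(truth, recon), float(chance_accuracy(6)))
```

A prompt is informative only if its mean accuracy sits clearly above the corresponding chance level, which is why the dashed baseline differs between panels.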