Is it Possible to Modify Text to a Target Readability Level? An Initial Investigation Using Zero-Shot Large Language Models

Asma Farajidizaji, Vatsal Raina, Mark Gales

TL;DR

This work investigates whether text can be modified to an absolute target readability level, independent of the source text, by introducing a readability-controlled text modification task that generates eight paraphrases per input at predefined FRES levels. It evaluates zero-shot prompting of ChatGPT and Llama-2, including a two-step paraphrasing approach, using both individual- and population-scale metrics to assess readability control and content preservation. While the models can rank paraphrase readability effectively (high Spearman correlations, up to about 0.88), they struggle to hit exact target scores, and larger readability shifts degrade semantic similarity and increase lexical divergence; finetuning a model (Llama-2 with QLoRA) shows limited absolute control but indicates the potential for further improvements. The study highlights the trade-off between readability control and content preservation, providing a framework and metrics for future work in absolute-readability generation and model finetuning.

Abstract

Text simplification is a common task where the text is adapted to make it easier to understand. Similarly, text elaboration can make a passage more sophisticated, offering a method to control the complexity of reading comprehension tests. However, text simplification and elaboration tasks can only alter the readability of a text relative to its original level. It is useful to directly modify the readability of any text to an absolute target readability level to cater to a diverse audience. Ideally, the readability of readability-controlled generated text should be independent of the source text. Therefore, we propose a novel readability-controlled text modification task. The task requires the generation of 8 versions at various target readability levels for each input text. We introduce novel readability-controlled text modification metrics. The baselines for this task use ChatGPT and Llama-2, with an extension approach introducing a two-step process (generating paraphrases by passing through the language model twice). The zero-shot approaches are able to push the readability of the paraphrases in the desired direction, but the final readability remains correlated with the original text's readability. We also find greater drops in semantic and lexical similarity between the source and target texts with greater shifts in the readability.
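
The target readability levels in this task are expressed as Flesch reading ease scores (FRES). As a rough illustration of what such a score measures, the sketch below computes the standard Flesch formula, 206.835 - 1.015 (words per sentence) - 84.6 (syllables per word), in plain Python. The function names and the vowel-group syllable heuristic are assumptions made for this example; they are not the tooling used in the paper, and a dictionary-based syllable counter would give somewhat different scores.

```python
import re


def count_syllables(word: str) -> int:
    """Approximate syllables as vowel groups (a heuristic, not a dictionary lookup)."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Treat a trailing 'e' as silent, except in '-le' endings such as 'table'.
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_reading_ease(text: str) -> float:
    """Flesch reading ease score: higher values indicate easier text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 206.835 - 1.015 * words_per_sentence - 84.6 * syllables_per_word


if __name__ == "__main__":
    easy = "The cat sat on the mat."
    hard = "Institutional heterogeneity complicates comprehensive international negotiations."
    print(round(flesch_reading_ease(easy), 3))
    print(flesch_reading_ease(easy) > flesch_reading_ease(hard))
```

Because the score depends only on sentence length and syllable density, two texts with very different content can share the same FRES. This is why the task pairs readability targets with separate measures of semantic and lexical similarity to the source.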

Paper Structure (17 sections, 3 equations, 4 figures, 5 tables)

This paper contains 17 sections, 3 equations, 4 figures, 5 tables.

Figures (4)

  • Figure 1: Example for the readability-controlled text modification task. The source text from CLEAR (Crossley et al., 2023) is paraphrased at various target readability levels according to the Flesch reading ease score (FRES) (Flesch, 1948).
  • Figure 2: Distribution of text readability scores.
  • Figure 3: Generated text readability against source text readability as a binned scatterplot.
  • Figure 4: Heatmaps of averaged select variables for each pair of source and target text readability classes. Each cell value is the mean of the specified variable for all texts that have a certain source readability and are modified to a certain target readability.