Digital Comprehensibility Assessment of Simplified Texts among Persons with Intellectual Disabilities

Andreas Säuberli; Franz Holzknecht; Patrick Haller; Silvana Deilen; Laura Schiffl; Silvia Hansen-Schirra; Sarah Ebling

TL;DR

This study addresses how to evaluate the comprehensibility of text simplifications with their primary target groups, specifically persons with intellectual disabilities (ID), using a digital, tablet-based reading platform. It compares unsimplified, manually simplified, and automatically simplified German texts across two reader groups (ID and non-ID) and analyzes four measures (comprehension questions, perceived difficulty, response times, and reading speed) within a Bayesian Rasch framework with three facets. Key findings show that manual simplification more reliably improves objective comprehension for the target group, while automatic simplification yields smaller or mixed effects and may be harder for control readers; cognitive and behavioral data reveal substantial heterogeneity and skimming tendencies in the ID group. The work demonstrates the viability of digital, interaction-level assessments for inclusive evaluation of automatic text simplification and underscores the need to tailor evaluation methods to primary target groups, combining subjective, objective, and behavioral metrics to capture comprehensibility more accurately.

Abstract

Text simplification refers to the process of increasing the comprehensibility of texts. Automatic text simplification models are most commonly evaluated by experts or crowdworkers instead of the primary target groups of simplified texts, such as persons with intellectual disabilities. We conducted an evaluation study of text comprehensibility including participants with and without intellectual disabilities reading unsimplified, automatically and manually simplified German texts on a tablet computer. We explored four different approaches to measuring comprehensibility: multiple-choice comprehension questions, perceived difficulty ratings, response time, and reading speed. The results revealed significant variations in these measurements, depending on the reader group and whether the text had undergone automatic or manual simplification. For the target group of persons with intellectual disabilities, comprehension questions emerged as the most reliable measure, while analyzing reading speed provided valuable insights into participants' reading behavior.

Paper Structure (26 sections, 3 figures, 2 tables)

Introduction
Related work
Human evaluation of automatic text simplification
Comprehension of simplified language by persons with intellectual disabilities
Digital assessment of reading comprehension
Materials and methods
Texts and comprehension questions
Participants
Target group
Control group
Procedure
Cognitive tasks
Reading tasks
Statistical analysis
Results
...and 11 more sections

Figures (3)

Figure 1: Screenshots of the reading task in Okra. (1) Initial reading screen, where only the text is visible. (2) Text difficulty rating screen. (3) Comprehension question screen; tapping the arrow buttons switches between questions.
Figure 2: Boxplots of the measurements from the cognitive tasks, compared between the target and control groups. Each data point is the measured value for a single participant, aggregated across all trials/stimuli (maximum for digit span, mean for all others), excluding practice trials.
Figure 3: Posterior distributions of the text version parameters for the four measurements in the reading task. Points are medians, error bars are 80%, 90%, and 95% credible intervals (CI). A bracket with $\blacktriangle$ indicates that the 80% CI of the difference between the two parameters does not include zero (i.e., we are 80% confident that there is a difference). Similarly with $\blacktriangle\blacktriangle$ for 90% CI and $\blacktriangle\blacktriangle\blacktriangle$ for 95% CI.
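
The marker convention described in the Figure 3 caption can be illustrated with a minimal sketch. It assumes paired posterior draws for two text version parameters and equal-tailed credible intervals; the caption does not state which interval type was used. The function names `credible_interval` and `count_markers` are hypothetical and are not taken from the paper.

```python
def credible_interval(samples, level):
    """Equal-tailed credible interval from posterior draws (assumed interval type)."""
    ordered = sorted(samples)
    n = len(ordered)
    tail = (1.0 - level) / 2.0
    lo_idx = round(tail * (n - 1))
    hi_idx = round((1.0 - tail) * (n - 1))
    return ordered[lo_idx], ordered[hi_idx]


def count_markers(draws_a, draws_b, levels=(0.80, 0.90, 0.95)):
    """Count how many CI levels of the difference (a - b) exclude zero.

    The draws are assumed to be paired, i.e. draws_a[i] and draws_b[i]
    come from the same posterior sample, so the difference is taken per draw.
    A result of 1, 2, or 3 corresponds to one, two, or three triangles.
    """
    diffs = [a - b for a, b in zip(draws_a, draws_b)]
    count = 0
    for level in levels:
        lo, hi = credible_interval(diffs, level)
        if lo > 0 or hi < 0:  # zero lies outside the interval
            count += 1
    return count
```

Because the intervals are nested, a difference whose 95% CI excludes zero also excludes zero at the 90% and 80% levels, so the count maps directly onto the number of triangles in the figure.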
