Evaluating LLMs for Targeted Concept Simplification for Domain-Specific Texts

Sumit Asthana, Hannah Rashkin, Elizabeth Clark, Fantine Huot, Mirella Lapata

TL;DR

This work addresses the challenge of helping skilled adult readers understand domain-specific texts. It introduces the task of targeted concept simplification and the WikiDomains dataset, which contains 22k definitions, each paired with a difficult concept that appears in it. It benchmarks multiple LLMs and a dictionary baseline, comparing three kinds of rewrite (simplifying the concept phrase, adding its definition, and explaining it in context), and evaluates them with human judgments and automated metrics, finding that readers prefer contextual explanations over lexical substitutions. No single model excels across all dimensions, and automated metrics correlate poorly with human judgments, which underscores the need for personalized, context-aware tools and better evaluation methodologies. The study highlights the potential of context-rich explanations to improve understanding and outlines practical considerations and future directions for evaluating reading support for domain-specific text.

Abstract

One useful application of NLP models is to support people in reading complex text from unfamiliar domains (e.g., scientific articles). Simplifying the entire text makes it understandable but sometimes removes important details. In contrast, helping adult readers understand difficult concepts in context can enhance their vocabulary and knowledge. In a preliminary human study, we first identify that lack of context and unfamiliarity with difficult concepts are a major reason for adult readers' difficulty with domain-specific text. We then introduce "targeted concept simplification," a simplification task for rewriting text to help readers comprehend text containing unfamiliar concepts. We also introduce WikiDomains, a new dataset of 22k definitions from 13 academic domains, each paired with a difficult concept within the definition. We benchmark the performance of open-source and commercial LLMs and a simple dictionary baseline on this task using human judgments of ease of understanding and meaning preservation. Interestingly, our human judges preferred explanations of the difficult concept over simplifications of the concept phrase. Further, no single model achieved superior performance across all quality dimensions, and automated metrics show low correlations with human evaluations of concept simplification ($\sim0.2$), opening up rich avenues for research on personalized human reading comprehension support.
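The abstract does not spell out how the dictionary baseline works. As a minimal sketch of one plausible reading (an assumption, not the authors' implementation), the function below inserts a parenthetical definition after the first mention of the difficult concept. The function name `add_definition`, the toy dictionary entry, and the example sentence are all hypothetical.

```python
import re


def add_definition(text, concept, dictionary):
    """Insert a parenthetical definition after the first mention of `concept`.

    `dictionary` maps lowercase concept phrases to definitions. This is a
    hypothetical stand-in for whatever dictionary resource a real baseline uses.
    """
    definition = dictionary.get(concept.lower())
    if definition is None:
        return text  # no entry available: leave the text unchanged
    pattern = re.compile(re.escape(concept), flags=re.IGNORECASE)
    # A function replacement avoids backslash-escape surprises in the definition.
    return pattern.sub(lambda m: f"{m.group(0)} ({definition})", text, count=1)


# Toy, made-up entry and sentence for illustration only (not from WikiDomains).
toy_dictionary = {
    "digits of precision": "the number of digits used to represent a value",
}
sentence = "The calculation uses many digits of precision."
rewritten = add_definition(sentence, "digits of precision", toy_dictionary)
print(rewritten)
```

A rewrite of this kind preserves the original wording and only adds information, which is why it makes a useful point of comparison for model-generated simplifications and explanations.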

Paper Structure

This paper contains 38 sections, 1 equation, 5 figures, 17 tables.

Figures (5)

  • Figure 1: An example from the dataset, which consists of a definition and a potential difficult concept in the text that a reader may struggle with. The task is to rewrite the definition in a way that simplifies this concept for the reader. (a) Simplifies "digits of precision" to "as many digits as needed", (b) Adds the definition of "digits of precision", (c) Contextually explains that "digits of precision" refers to precision of calculations and how it relates to memory.
  • Figure 2: Results of the annotator study. We asked annotators to read complex text and report (1) what made the text difficult for them to understand and (2) how they would want a tutor to edit the text to help their understanding.
  • Figure 3: Pearson correlations between automated metrics and human evaluations ($^{***}: p < 0.005, ^{**}: p < 0.01, ^{*}: p < 0.05$).
  • Figure 4: Screenshot of an annotation example for understanding the difficulties that readers face with domain-specific text.
  • Figure 5: Annotation example for evaluating LLM-rewritten definitions for concept simplification.
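The Pearson correlations reported in Figure 3 relate an automated metric's scores to human ratings of the same rewrites. The following is a minimal sketch of that computation, assuming one metric score and one human rating per rewritten definition. The helper name `pearson` and the numbers are illustrative only and are not taken from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Toy, made-up numbers for illustration only (not results from the paper).
metric_scores = [0.2, 0.4, 0.6, 0.8]
human_ratings = [1.0, 2.0, 3.0, 4.0]
r = pearson(metric_scores, human_ratings)
```

In this toy case the two lists are perfectly linearly related, so the coefficient is at its maximum. A value near 0.2, as reported in the abstract, would indicate that the metric tracks human judgments only weakly.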