SciGisPy: a Novel Metric for Biomedical Text Simplification via Gist Inference Score
Chen Lyu, Gabriele Pergola
TL;DR
The paper tackles the challenge of evaluating automatic text simplification in the biomedical domain, where general-domain metrics fail to capture gist comprehension. It introduces SciGisPy, a domain-adapted metric inspired by the Gist Inference Score (GIS), which incorporates semantic chunking, Information Content-based hypernym measures, and biomedical embeddings to better assess how well simplified text facilitates gist inferences. In an extensive ablation study on the Cochrane dataset, SciGisPy outperforms the original GIS formulation, with a notable increase in correctly identified simplified texts (84% on the test set, versus 44.8% for the original GIS formulation). The work demonstrates that tailoring gist-based evaluation to biomedical content yields more accurate assessments of simplification quality and has practical implications for designing accessible biomedical communications.
Abstract
Biomedical literature is often written in highly specialized language, posing significant comprehension challenges for non-experts. Automatic text simplification (ATS) offers a solution by making such texts more accessible while preserving critical information. However, evaluating ATS for biomedical texts is still challenging due to the limitations of existing evaluation metrics. General-domain metrics like SARI, BLEU, and ROUGE focus on surface-level text features, and readability metrics like FKGL and ARI fail to account for domain-specific terminology or assess how well the simplified text conveys core meanings (gist). To address this, we introduce SciGisPy, a novel evaluation metric inspired by Gist Inference Score (GIS) from Fuzzy-Trace Theory (FTT). SciGisPy measures how well a simplified text facilitates the formation of abstract inferences (gist) necessary for comprehension, especially in the biomedical domain. We revise GIS for this purpose by introducing domain-specific enhancements, including semantic chunking, Information Content (IC) theory, and specialized embeddings, while removing unsuitable indexes. Our experimental evaluation on the Cochrane biomedical text simplification dataset demonstrates that SciGisPy outperforms the original GIS formulation, with a significant increase in correctly identified simplified texts (84% versus 44.8%). The results and a thorough ablation study confirm that SciGisPy better captures the essential meaning of biomedical content, outperforming existing approaches.
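To make the Information Content (IC) idea mentioned above more concrete, the following is a minimal sketch of the standard corpus-based definition, IC(c) = -log p(c), where p(c) is the share of corpus mentions subsumed by concept c in a hypernym hierarchy. The taxonomy, the counts, and the function names are hypothetical and chosen only for illustration; the abstract does not specify how SciGisPy computes its IC-based hypernym measure, so this should not be read as the authors' implementation.

```python
import math

# Hypothetical toy taxonomy: child -> parent (hypernym). Not from the paper.
HYPERNYM = {
    "aspirin": "analgesic",
    "ibuprofen": "analgesic",
    "analgesic": "drug",
    "antibiotic": "drug",
    "drug": "entity",
}

# Hypothetical corpus counts for direct mentions of each concept.
COUNTS = {
    "aspirin": 5,
    "ibuprofen": 3,
    "analgesic": 2,
    "antibiotic": 4,
    "drug": 6,
    "entity": 0,
}


def cumulative_counts(counts, hypernym):
    """Propagate each concept's count up to all of its ancestors."""
    total = dict.fromkeys(counts, 0)
    for concept, n in counts.items():
        node = concept
        while node is not None:
            total[node] = total.get(node, 0) + n
            node = hypernym.get(node)  # None once the root is passed
    return total


def information_content(concept, counts=COUNTS, hypernym=HYPERNYM):
    """IC(c) = -log p(c), with p(c) the share of mentions subsumed by c."""
    cum = cumulative_counts(counts, hypernym)
    root_total = max(cum.values())  # assumes a single root covering all counts
    return -math.log(cum[concept] / root_total)


if __name__ == "__main__":
    for term in ("drug", "analgesic", "aspirin"):
        print(term, round(information_content(term), 3))
```

Under this definition, specific terms such as "aspirin" carry more information than their hypernyms such as "drug", which is the general property that makes IC a candidate signal for how abstract or specific the wording of a text is.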
