REFeREE: A REference-FREE Model-Based Metric for Text Simplification

Yichen Huang, Ekaterina Kochmar

TL;DR

REFeREE introduces a reference-free, model-based metric for text simplification built on a 3-stage curriculum that starts with scalable reference-free pretraining, augments with reference-based supervision, and ends with fine-tuning on human ratings. The method uses a DeBERTa-v3-base backbone with separate regression heads for multiple supervision signals spanning meaning preservation, fluency, and simplicity, and it leverages both unlabeled data and aligned TS corpora. Empirical results show REFeREE outperforms traditional reference-based metrics on overall quality assessment and remains competitive on specific aspects, with notable gains when using a RoBERTa backbone. The work highlights the usefulness of large-scale pretraining, multi-signal supervision, and targeted fine-tuning for robust, scalable TS evaluation, while acknowledging limitations and suggesting directions for future expansion to other languages and document-level simplification.

Abstract

Text simplification lacks a universal standard of quality, and annotated reference simplifications are scarce and costly. We propose to alleviate such limitations by introducing REFeREE, a reference-free model-based metric with a 3-stage curriculum. REFeREE leverages an arbitrarily scalable pretraining stage and can be applied to any quality standard as long as a small number of human annotations are available. Our experiments show that our metric outperforms existing reference-based metrics in predicting overall ratings and reaches competitive and consistent performance in predicting specific ratings while requiring no reference simplifications at inference time.

Paper Structure (20 sections, 1 equation, 1 figure, 5 tables)