Aligning Sentence Simplification with ESL Learner's Proficiency for Language Acquisition

Guanlin Li, Yuki Arase, Noel Crespi

TL;DR

This work tackles ESL sentence simplification by aligning outputs with CEFR proficiency and boosting target-level vocabulary without relying on parallel corpora. It introduces a reinforcement learning framework on a pre-trained LLM that performs lookahead decoding under Disjunctive Normal Form vocabulary constraints, aiming to maximize target vocabulary coverage while ensuring the final sentence matches the learner’s level. Two reward models—Lexical Constraint Reward with a dynamic, entropy-driven objective and a Sentence-Level Reward via pairwise CEFR judgments—drive training, with stabilized PPO updates and entropy regularization from a frozen reference model. Evaluations on CEFR-SP-Test and TurkCorpus show up to a $20\%$ increase in target vocabulary coverage and strong simplification quality, confirmed by human judgments; the approach offers a practical path to ESL acquisition tools without costly parallel data.
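The Disjunctive Normal Form vocabulary constraints mentioned above can be read, in the generic logical sense, as an OR over clauses, where each clause is an AND over required words. The sketch below only illustrates that reading on a finished sentence. The function name `satisfies_dnf`, the token-list input, and the example clauses are hypothetical, and the paper's actual lookahead decoding procedure is not reproduced here.

```python
from typing import Iterable, Sequence


def satisfies_dnf(tokens: Iterable[str], clauses: Sequence[Sequence[str]]) -> bool:
    """Return True if at least one clause has all of its words in `tokens`.

    `clauses` is a DNF constraint: an OR over clauses, each clause an AND
    over words. This is an illustrative check, not the paper's decoder.
    """
    present = {t.lower() for t in tokens}
    return any(all(w.lower() in present for w in clause) for clause in clauses)


# Hypothetical constraint: ("purchase" AND "vehicle") OR ("bought" AND "car")
example_clauses = [["purchase", "vehicle"], ["bought", "car"]]
hit = satisfies_dnf("She bought a new car".split(), example_clauses)
miss = satisfies_dnf("She bought a bike".split(), example_clauses)
```

Here `hit` is true because the second clause is fully covered, while `miss` is false because no single clause has all of its words present.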

Abstract

Text simplification is crucial for improving accessibility and comprehension for English as a Second Language (ESL) learners. This study goes a step further and aims to facilitate ESL learners' language acquisition through simplification. Specifically, we propose simplifying complex sentences to levels appropriate for learners while also increasing the coverage of target-level vocabulary in the simplifications. We achieve this without a parallel corpus by conducting reinforcement learning on a large language model. Our method employs token-level and sentence-level rewards, and iteratively trains the model on its self-generated outputs to guide it toward simplification hypotheses that satisfy the target attributes. Experimental results on the CEFR-SP and TurkCorpus datasets show that the proposed method can increase the frequency and diversity of target-level vocabulary by more than $20\%$ compared to baseline models, while maintaining high simplification quality.
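As a rough illustration of what "frequency and diversity of vocabulary of the target level" could mean, the sketch below counts how often target-level words occur in a set of outputs and how many distinct target-level words are used. The definitions are assumptions made for illustration (frequency as the share of tokens that are target-level words, diversity as the share of the target word list that appears at least once), as are the function name `target_vocab_stats`, the regex tokenizer, and the toy word list. The paper's exact metrics may differ.

```python
import re
from collections import Counter
from typing import Iterable, Tuple


def target_vocab_stats(sentences: Iterable[str], target_vocab: Iterable[str]) -> Tuple[float, float]:
    """Return (frequency, diversity) of target-level words in `sentences`.

    frequency: target-word tokens / all tokens (assumed definition)
    diversity: distinct target words used / size of target list (assumed)
    """
    target = {w.lower() for w in target_vocab}
    counts: Counter = Counter()
    total = 0
    for sentence in sentences:
        # Simple lowercase word tokenizer; a real setup would likely lemmatize.
        tokens = re.findall(r"[a-z']+", sentence.lower())
        total += len(tokens)
        counts.update(t for t in tokens if t in target)
    frequency = sum(counts.values()) / total if total else 0.0
    diversity = len(counts) / len(target) if target else 0.0
    return frequency, diversity


# Toy example with a hypothetical target-level word list.
outputs = ["The cat sat on the mat.", "A dog chased the cat."]
frequency, diversity = target_vocab_stats(outputs, {"cat", "dog", "bird", "mat"})
```

In the toy example, 4 of the 11 tokens are target-level words and 3 of the 4 listed words appear, so the two values are 4/11 and 0.75.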

Paper Structure

This paper contains 41 sections, 10 equations, 6 figures, 11 tables, 1 algorithm.

Figures (6)

  • Figure 1: (better viewed in color) The overall framework of the proposed method: the simplification model is initialized from a pretrained large language model, which is also used as a frozen reference model to provide entropy regularization (part 0.); top-k sampling is adopted in the decoding process to sample varied simplifications for the complex sentence (part 1.a.); the generated simplifications are evaluated against the language proficiency level (vocabulary level and sentence level) of the target audience, and the evaluations are used as rewards to update the simplification model (part 1.b.) so that it adopts a better decoding strategy.
  • Figure 2: Reward effects on target vocabulary coverage
  • Figure 3: Sentence level reward model evaluation accuracy
  • Figure 4: Training stability with and without dynamic reward
  • Figure 5: Screenshot of annotation guidelines
  • ...and 1 more figure