
mEdIT: Multilingual Text Editing via Instruction Tuning

Vipul Raheja, Dimitris Alikaniotis, Vivek Kulkarni, Bashar Alhafni, Dhruv Kumar

TL;DR

mEdIT tackles multilingual text editing by training instruction-tuned multilingual LLMs on curated, task-diverse data for grammatical correction, paraphrasing, and simplification across seven languages. It combines encoder–decoder and decoder-only architectures and evaluates a broad spectrum of multilingual models against baselines, showing that task- and language-specific fine-tuning yields strong, cross-language editing performance. The study reveals how model scale, data mixtures, and instruction language impact results, and demonstrates generalization to languages unseen during fine-tuning. By releasing data, code, and models, the work advances multilingual intelligent writing assistants and highlights directions for broader language coverage and improved evaluation of editing quality.

Abstract

We introduce mEdIT, a multi-lingual extension to CoEdIT -- the recent state-of-the-art text editing models for writing assistance. mEdIT models are trained by fine-tuning multi-lingual large, pre-trained language models (LLMs) via instruction tuning. They are designed to take instructions from the user specifying the attributes of the desired text in the form of natural language instructions, such as Grammatik korrigieren (German) or Parafrasee la oración (Spanish). We build mEdIT by curating data from multiple publicly available human-annotated text editing datasets for three text editing tasks (Grammatical Error Correction (GEC), Text Simplification, and Paraphrasing) across diverse languages belonging to six different language families. We detail the design and training of mEdIT models and demonstrate their strong performance on many multi-lingual text editing benchmarks against other multilingual LLMs. We also find that mEdIT generalizes effectively to new languages over multilingual baselines. We publicly release our data, code, and trained models at https://github.com/vipulraheja/medit.
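
As a rough illustration of the instruction-driven interface described above, the sketch below pairs a natural-language instruction with an input text and labels the request as monolingual or cross-lingual by comparing the instruction language with the text language. The `EditRequest` class, its field names, the example sentence, and the `"<instruction>: <text>"` prompt layout are all assumptions made for illustration; the abstract does not specify the prompt format the released models expect.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EditRequest:
    """Hypothetical container for one editing request (not from the paper)."""
    instruction: str       # e.g. "Grammatik korrigieren"
    instruction_lang: str  # language code of the instruction, e.g. "de"
    text: str              # text to be edited
    text_lang: str         # language code of the input text

    def setting(self) -> str:
        # Monolingual when instruction and text share a language,
        # cross-lingual otherwise.
        if self.instruction_lang == self.text_lang:
            return "monolingual"
        return "cross-lingual"

    def to_prompt(self) -> str:
        # Assumed layout: instruction, a colon, then the text.
        return f"{self.instruction}: {self.text}"


german = EditRequest("Grammatik korrigieren", "de",
                     "Ich habe gestern ins Kino gegangen.", "de")
mixed = EditRequest("Parafrasee la oración", "es",
                    "Ich bin gestern ins Kino gegangen.", "de")

print(german.setting(), "|", german.to_prompt())
print(mixed.setting(), "|", mixed.to_prompt())
```

In both requests the output text would be expected to stay in the language of the input text; only the instruction language differs.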

Paper Structure (63 sections, 11 figures, 8 tables)

Figures (11)

  • Figure 1: Examples illustrating multilingual and cross-lingual text editing. The editing instructions are described in bold. Note that the input and output texts are always in the same language. The monolingual vs. cross-lingual setting is determined by comparing the language of the edit instruction to the language of the input text.
  • Figure 2: Data distribution for each of the three tasks and seven languages on which we train. The amount of data is shown on a log scale to aid visualization.
  • Figure 3: Overall performance comparison of all baselines against trained models. We calculate the aggregated performance across all tasks using the harmonic mean of task-specific scores, as described in the section on text editing quality. The baselines are "No Edits (Copy)" and "English-only" (marked "-En"); our trained models are marked "CLM" and "Seq2Seq".
  • Figure 4: Aggregated performance on different tasks broken down by instruction language. Apart from some minor fluctuation, there is no significant impact of instruction language on our results.
  • Figure 5: Aggregated model performance by language (for GEC, Paraphrasing, and Simplification). For each task, we aggregate the relevant metrics as described in the section on text editing quality and split them by model training.
  • ...and 6 more figures
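
The Figure 3 caption aggregates task-specific scores with a harmonic mean. A minimal sketch of that aggregation follows, assuming one positive score per task; the task names and numbers are made up for illustration and are not results from the paper, and `aggregate_scores` is a hypothetical helper name.

```python
from statistics import harmonic_mean, mean


def aggregate_scores(task_scores: dict[str, float]) -> float:
    """Harmonic mean of per-task scores (scores must not be negative)."""
    return harmonic_mean(list(task_scores.values()))


# Illustrative numbers only, not taken from the paper.
scores = {"gec": 60.0, "paraphrasing": 40.0, "simplification": 30.0}

overall = aggregate_scores(scores)
print(f"harmonic mean:   {overall:.2f}")
print(f"arithmetic mean: {mean(scores.values()):.2f}")
```

With these numbers the harmonic mean is 40.0 against an arithmetic mean of about 43.3: the harmonic mean is pulled toward the weakest task, so a model cannot reach a high aggregate by excelling at one task while doing poorly on another.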