A Representation Level Analysis of NMT Model Robustness to Grammatical Errors

Abderrahmane Issam, Yusuf Can Semerci, Jan Scholtes, Gerasimos Spanakis

TL;DR

The paper tackles NMT robustness to grammatical errors by examining internal representations through GED probing, representational distance, and Robustness Heads. It shows the encoder behaves like a Grammatical Error Correction system, detecting errors in early layers and correcting them as information moves to deeper layers, especially after fine-tuning for robustness. Fine-tuning only the encoder on noisy data yields substantial robustness gains with minimal impact on clean translation, and robustness relies on specialized attention heads that attend to interpretable POS categories. This work provides a practical, model-agnostic framework for auditing robustness across languages and suggests a targeted, efficient path to improve NMT reliability in noisy real-world inputs.

Abstract

Understanding robustness is essential for building reliable NLP systems. Unfortunately, in the context of machine translation, previous work has mainly focused on documenting robustness failures or improving robustness. In contrast, we study robustness from a model representation perspective by looking at internal model representations of ungrammatical inputs and how they evolve through model layers. For this purpose, we perform Grammatical Error Detection (GED) probing and representational similarity analysis. Our findings indicate that the encoder first detects the grammatical error, then corrects it by moving its representation toward the correct form. To understand what contributes to this process, we turn to the attention mechanism, where we identify what we term Robustness Heads. We find that Robustness Heads attend to interpretable linguistic units when responding to grammatical errors, and that when we fine-tune models for robustness, they tend to rely more on Robustness Heads for updating the ungrammatical word representation.
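
The representational similarity analysis mentioned above is reported in the figures as a CKA distance between clean and noisy word representations. The following is a minimal sketch of how such a distance could be computed, assuming the linear variant of CKA and defining distance as one minus similarity. Both choices, and the function names `linear_cka` and `cka_distance`, are illustrative assumptions rather than the authors' implementation, and the input matrices stand in for per-layer encoder states that would have to be extracted separately.

```python
import numpy as np


def linear_cka(x: np.ndarray, y: np.ndarray) -> float:
    """Linear CKA similarity between two (n_samples, dim) matrices.

    Row i of x and row i of y are assumed to describe the same word
    position, e.g. the clean word and its noisy counterpart.
    """
    # Center each feature so the score ignores constant offsets.
    x = x - x.mean(axis=0, keepdims=True)
    y = y - y.mean(axis=0, keepdims=True)
    # Squared Frobenius norm of the cross-covariance, normalized by
    # the Frobenius norms of the two self-covariances.
    cross = np.linalg.norm(y.T @ x, ord="fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, ord="fro")
    norm_y = np.linalg.norm(y.T @ y, ord="fro")
    return float(cross / (norm_x * norm_y))


def cka_distance(x: np.ndarray, y: np.ndarray) -> float:
    """Assumed distance form: 1 - CKA, so identical geometry gives 0."""
    return 1.0 - linear_cka(x, y)


if __name__ == "__main__":
    # Hypothetical stand-ins for one layer's hidden states of
    # 100 clean words and their noisy versions (dimension 16).
    rng = np.random.default_rng(0)
    clean = rng.normal(size=(100, 16))
    noisy = clean + 0.5 * rng.normal(size=(100, 16))
    print("CKA distance:", cka_distance(clean, noisy))
```

Computed layer by layer, a distance that shrinks with depth would correspond to the noisy word's representation moving toward the clean one, which is the pattern the abstract describes. Linear CKA is invariant to orthogonal transformations and isotropic scaling of either input, which makes distances comparable across layers with different activation scales.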

Paper Structure

This paper contains 44 sections, 15 figures, 4 tables.

Figures (15)

  • Figure 1: GED probing performance of Noise-Finetuned, Clean-Finetuned and Base models on En-Es. The GED probing performance of Noise-Finetuned models degrades in deeper layers.
  • Figure 2: CKA distance between clean and noisy word representations across models and errors on En-Es. Noise-Finetuned models drive the representation of the noisy word to be more similar to that of the clean word.
  • Figure 3: Robustness Heads attention to the 10 most common POS tags in the test set on En-Es. The scale of attention is relative to each base model and error. We highlight POS tags that are attended to the most across models.
  • Figure 4: Accuracy of Robustness and Influential heads on En-Es. Accuracy is higher for Noise-Finetuned models, especially in deep layers.
  • Figure 5: COMET difference between performance on clean and noisy test sets after fine-tuning different parts of the models. We show the average across 3 multilingual models (M2M100, MBART, NLLB) and bilingual models (OPUS-MT for each language pair). Fine-tuning the encoder is almost as good as fine-tuning the full model.
  • ...and 10 more figures