
Mitigating the Language Mismatch and Repetition Issues in LLM-based Machine Translation via Model Editing

Weichuan Wang, Zhaoyi Li, Defu Lian, Chen Ma, Linqi Song, Ying Wei

TL;DR

This work explores whether model editing methods can mitigate language mismatch and repetition errors, e.g., by locating the Feed-Forward Network neurons or other components responsible for the errors and deactivating them at inference time, and finds that directly applying such methods has limited effect on the targeted errors.

Abstract

Large Language Models (LLMs) have recently revolutionized the NLP field, yet they still fall short in some specific downstream tasks. In this work, we focus on using LLMs for machine translation, where we observe that two error patterns occur frequently and drastically affect translation quality: language mismatch and repetition. We set out to explore the potential for mitigating these two issues with model editing methods, e.g., by locating the Feed-Forward Network (FFN) neurons or other components that are responsible for the errors and deactivating them at inference time. We find that directly applying such methods either has limited effect on the targeted errors or has a significant negative side effect on general translation quality, indicating that the located components may also be crucial for keeping LLM-based machine translation on track. To this end, we propose to refine the located components by taking the intersection of the locating results under different language settings, which filters out the information that is irrelevant to the targeted errors. The experimental results empirically demonstrate that our methods can effectively reduce the language mismatch and repetition ratios while improving or maintaining general translation quality in most cases.
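
The refinement step described in the abstract (locate components under several language settings, keep only those found in every setting, then deactivate them) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the function names, the top-k selection rule, the toy importance scores, and the choice to deactivate by zeroing an activation.

```python
# Hypothetical sketch: names, scores, and the top-k rule are illustrative
# assumptions, not the paper's actual locating procedure.

def top_k_components(scores, k):
    """Return the indices of the k highest-scoring components."""
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return set(ranked[:k])

def intersect_located(scores_by_setting, k):
    """Keep only components located under every language setting."""
    located = [top_k_components(s, k) for s in scores_by_setting.values()]
    return set.intersection(*located)

def deactivate(activations, components):
    """Return a copy of the activations with the selected components zeroed."""
    return [0.0 if i in components else a for i, a in enumerate(activations)]

# Toy importance scores for five components under two language settings.
scores_by_setting = {
    "en->de": [0.9, 0.1, 0.8, 0.2, 0.7],
    "en->zh": [0.8, 0.7, 0.9, 0.1, 0.2],
}

shared = intersect_located(scores_by_setting, k=3)
edited = deactivate([1.0, 1.0, 1.0, 1.0, 1.0], shared)
```

In this toy setup, components that score highly under only one setting (index 4 for en->de, index 1 for en->zh) are dropped by the intersection, which mirrors the idea of filtering out components that are not specific to the targeted error.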

Paper Structure

This paper contains 40 sections, 3 equations, 5 figures, 21 tables.

Figures (5)

  • Figure 1: The illustration of the language mismatch error (a) and the repetition error (b).
  • Figure 2: Heatmaps of AIE values for attention heads in LLaMA2-7B for de$\rightarrow$en setting (a) and en$\rightarrow$zh setting (b). The x-axis and y-axis refer to the layer and head, respectively. Brighter color refers to the head with larger AIE value.
  • Figure 3: Effect of intervening on the heads located under the en$\rightarrow$de setting when translating in the zh$\rightarrow$en, en$\rightarrow$zh and de$\rightarrow$en settings: (a) percentage decrease in LMR; (b) percentage improvement in COMET22DA. Blue bars show the intervention; red bars (comparison group) show the results of intervening on the same number of random heads.
  • Figure 4: Heatmaps of AIE values for attention heads in LLaMA2-7B for en$\rightarrow$de setting (a), de$\rightarrow$en setting (b), en$\rightarrow$zh setting (c) and zh$\rightarrow$en setting (d). The x-axis and y-axis refer to the layer and head, respectively. Brighter color refers to the head with larger AIE value.
  • Figure 5: Heatmaps of AIE values for attention heads in LLaMA2-13B for en$\rightarrow$de setting (a), de$\rightarrow$en setting (b), en$\rightarrow$zh setting (c) and zh$\rightarrow$en setting (d). The x-axis and y-axis refer to the layer and head, respectively. Brighter color refers to the head with larger AIE value.