Analysis of Levenshtein Transformer's Decoder and Its Variants

Ruiyang Zhou

TL;DR

This report analyses LevT's decoder, examining the length of its decoding results, its subword generation, and the capability of its deletion module, in order to identify weaknesses of the decoder for future improvement.

Abstract

Levenshtein Transformer (LevT) is a non-autoregressive machine translation model that achieves high decoding efficiency and comparable translation quality in terms of BLEU score, owing to its parallel decoding and iterative refinement procedure. What deficiencies do its translations have, and what improvements could be made? In this report, we focus on LevT's decoder and analyse the length of its decoding results, its subword generation, and the capability of its deletion module. We hope to identify weaknesses of the decoder for future improvement. We also compare translations from the original LevT, knowledge-distilled LevT (KD-LevT), LevT with translation memory, and KD-LevT with translation memory to see how knowledge distillation (KD) and translation memory can help.

Paper Structure (27 sections, 8 figures, 13 tables)

This paper contains 27 sections, 8 figures, 13 tables.

Figures (8)

  • Figure 1: LevT structure during inference.
  • Figure 2: LevT structure during training.
  • Figure 3: Pld prediction accuracy of four models.
  • Figure 4: Precision and recall of subword/complete word token prediction.
  • Figure 5: Word-level BLEU score with decoding initialization missing only subwords or only complete words.
  • ...and 3 more figures