Character-based Neural Machine Translation

Marta R. Costa-Jussà, José A. R. Fonollosa

TL;DR

This work addresses the vocabulary and morphological challenges of neural machine translation by introducing character-based source word embeddings, built with a convolutional neural network and highway layers, that replace standard word lookup embeddings. The character-based representations are integrated into an attention-based encoder–decoder framework, yielding an unlimited source vocabulary and better handling of affixes. Empirical results on the German–English WMT task show BLEU gains of up to about 3 points, driven by fewer unknown words and improved morphology handling, with additional gains when unknown-word tokens (UNKs) are postprocessed. The approach offers a practical path to more scalable and morphologically aware NMT, and could be extended to target-side representations in future work.

Abstract

Neural Machine Translation (MT) has reached state-of-the-art results. However, one of the main challenges that neural MT still faces is dealing with very large vocabularies and morphologically rich languages. In this paper, we propose a neural MT system using character-based embeddings in combination with convolutional and highway layers to replace the standard lookup-based word representations. The resulting unlimited-vocabulary and affix-aware source word embeddings are tested in a state-of-the-art neural MT system based on an attention-based bidirectional recurrent neural network. The proposed MT scheme provides improved results even when the source language is not morphologically rich. Improvements of up to 3 BLEU points are obtained in the German-English WMT task.
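
The pipeline the abstract describes (character embeddings, then convolutional layers, then highway layers, producing one vector per source word) can be illustrated with a small pure-Python sketch. This is not the authors' implementation: the class name `CharWordEmbedder`, the dimensions, the filter widths, the tanh and ReLU activations, the max-over-time pooling step, and the random untrained weights are all assumptions chosen only to keep the example short and runnable.

```python
import math
import random


def _matrix(rows, cols, rng):
    """Random weight matrix stored as a list of rows (untrained, illustrative)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def _matvec(matrix, vec):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


class CharWordEmbedder:
    """Hypothetical toy model: char embeddings -> convolution -> max pooling -> highway."""

    def __init__(self, alphabet, char_dim=4, widths=(1, 2, 3), n_filters=3, seed=0):
        rng = random.Random(seed)
        self.char_dim = char_dim
        # One small vector per character (assumed sizes, not taken from the paper).
        self.char_vecs = {c: [rng.uniform(-0.5, 0.5) for _ in range(char_dim)]
                          for c in alphabet}
        # One filter bank per width; each filter spans `width` consecutive characters.
        self.filters = {w: _matrix(n_filters, w * char_dim, rng) for w in widths}
        self.out_dim = n_filters * len(self.filters)
        # Highway layer parameters: transform weights, gate weights, gate bias.
        self.w_h = _matrix(self.out_dim, self.out_dim, rng)
        self.w_t = _matrix(self.out_dim, self.out_dim, rng)
        self.b_t = [-2.0] * self.out_dim  # negative bias: the gate starts mostly "carry"

    def embed(self, word):
        zero = [0.0] * self.char_dim
        # Characters outside the alphabet map to a zero vector in this sketch.
        chars = [self.char_vecs.get(c, zero) for c in word]
        # Zero-pad short words so the widest filter fits at least once.
        chars += [zero] * (max(self.filters) - len(chars))
        pooled = []
        for width in sorted(self.filters):
            # Each window is the concatenation of `width` adjacent character vectors.
            windows = [sum(chars[i:i + width], [])
                       for i in range(len(chars) - width + 1)]
            for filt in self.filters[width]:
                # Max-over-time pooling: keep the strongest response of each filter.
                pooled.append(max(math.tanh(sum(a * b for a, b in zip(filt, win)))
                                  for win in windows))
        # Highway layer: out = t * relu(W_h y) + (1 - t) * y, t = sigmoid(W_t y + b_t).
        h = [max(0.0, x) for x in _matvec(self.w_h, pooled)]
        t = [1.0 / (1.0 + math.exp(-(x + b)))
             for x, b in zip(_matvec(self.w_t, pooled), self.b_t)]
        return [ti * hi + (1.0 - ti) * yi for ti, hi, yi in zip(t, h, pooled)]


embedder = CharWordEmbedder("abcdefghijklmnopqrstuvwxyzäöüß")
vec = embedder.embed("häuser")
print(len(vec))  # 9 (3 filters for each of 3 widths)
```

Because the vector is computed from characters rather than looked up in a fixed table, any source word receives an embedding, including words never seen in training; this is the sense in which the source vocabulary is unlimited. In a full system these vectors would feed the attention-based bidirectional encoder in place of lookup embeddings, and all weights would be trained jointly.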

Paper Structure

This paper contains 8 sections, 1 figure, 3 tables.

Figures (1)

  • Figure 1: Character-based word embedding