adaptNMT: an open-source, language-agnostic development environment for Neural Machine Translation

Séamus Lankford, Haithem Afli, Andy Way

TL;DR

The application streamlines all processes involved in the development and deployment of RNN and Transformer neural translation models. It is particularly useful for new entrants to the field because it greatly simplifies setting up the development environment and creating train, validation and test splits.

Abstract

adaptNMT streamlines all processes involved in the development and deployment of RNN and Transformer neural translation models. As an open-source application, it is designed for both technical and non-technical users who work in the field of machine translation. Built upon the widely adopted OpenNMT ecosystem, the application is particularly useful for new entrants to the field because it greatly simplifies setting up the development environment and creating train, validation and test splits. Graphing, embedded within the application, illustrates the progress of model training, and SentencePiece is used for creating subword segmentation models. Hyperparameter customization is facilitated through an intuitive user interface, and a single-click model development approach has been implemented. Models developed by adaptNMT can be evaluated using a range of metrics and deployed as a translation service within the application. To support eco-friendly research in the NLP space, a green report also flags the power consumption and kgCO$_{2}$ emissions generated during model development. The application is freely available.

Paper Structure

This paper contains 28 sections, 8 equations, 10 figures, 8 tables.

Figures (10)

  • Figure 1: The Transformer architecture, using an encoder-decoder structure (Vaswani et al., 2017). The encoder maps an input sequence to a representation that is passed to the decoder. The decoder generates a new output by combining the encoder output with the decoder output from the previous step.
  • Figure 2: Multi-Head Attention in the Decoder (Vaswani et al., 2017). In the decoder, a multi-head layer receives queries from the previous decoder sublayer, and the keys and values from the encoder output. This allows the decoder to attend to all words in the input sequence.
  • Figure 3: Neurons within an RNN. At the input side, the neuron's input at time $t$ is a function of the encoded word (i.e. input vector $x_t$) and a hidden state vector $h_{t-1}$, which summarizes the preceding part of the sequence. The output generated by the neuron is represented by the vector $O_t$.
  • Figure 4: Encoder-decoder architecture. The encoder encodes the entire input sequence into a fixed-length context vector, $c$, by processing the input one time step at a time. The function of the decoder is to read this context vector while stepping through output time steps.
  • Figure 5: Beam Search Algorithm (Yang et al., 2020)
  • ...and 5 more figures
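
The recurrence described in the captions for Figures 3 and 4 can be illustrated with a minimal sketch. It assumes a simple Elman-style cell with a tanh activation and a linear output layer. The captions do not give the exact formulation, so the function names (`rnn_step`, `encode`) and the weight names (`W_x`, `W_h`, `W_o`, `b_h`, `b_o`) are hypothetical and chosen only for illustration. Each step combines the input vector $x_t$ with the previous hidden state $h_{t-1}$ to produce an output $O_t$, and the final hidden state stands in for the fixed-length context vector $c$.

```python
import math


def rnn_step(x_t, h_prev, W_x, W_h, W_o, b_h, b_o):
    """One time step of a simple RNN cell (assumed Elman-style, tanh activation)."""
    hidden = len(h_prev)
    # New hidden state: h_t = tanh(W_x x_t + W_h h_{t-1} + b_h)
    h_t = [
        math.tanh(
            sum(W_x[i][j] * x_t[j] for j in range(len(x_t)))
            + sum(W_h[i][k] * h_prev[k] for k in range(hidden))
            + b_h[i]
        )
        for i in range(hidden)
    ]
    # Output vector: O_t = W_o h_t + b_o (linear output layer is an assumption)
    o_t = [
        sum(W_o[m][i] * h_t[i] for i in range(hidden)) + b_o[m]
        for m in range(len(W_o))
    ]
    return h_t, o_t


def encode(inputs, h0, W_x, W_h, W_o, b_h, b_o):
    """Run the cell over a whole sequence and return (context vector, per-step outputs).

    The final hidden state plays the role of the context vector c.
    """
    h = h0
    outputs = []
    for x_t in inputs:
        h, o_t = rnn_step(x_t, h, W_x, W_h, W_o, b_h, b_o)
        outputs.append(o_t)
    return h, outputs
```

A decoder in this style would start from the returned context vector and apply the same kind of step once per output position. Real systems typically use gated cells such as LSTMs or GRUs in place of this plain cell.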