PyMarian: Fast Neural Machine Translation and Evaluation in Python
Thamme Gowda, Roman Grundkiewicz, Elijah Rippeth, Matt Post, Marcin Junczys-Dowmunt
TL;DR
PyMarian provides Python bindings that expose Marian NMT's fast C++ inference and training to Python, enabling seamless integration with Python ecosystems. It introduces pymarian-eval for fast MT evaluation by reusing converted COMET and BLEURT checkpoints, achieving up to $7.8\times$ speedups on multi-GPU setups and up to $44\times$ faster model loading. The work demonstrates broad applicability through Jupyter notebooks, OPUS-MT decoding, and a Flask web demo, and compares favorably against native metric toolchains in both speed and fidelity. Overall, PyMarian lowers the barrier to using Marian's speed at scale from Python, facilitating rapid experimentation, evaluation, and deployment in research and production contexts.
Abstract
The deep learning language of choice these days is Python; measured by factors such as available libraries and technical support, it is hard to beat. At the same time, software written in lower-level programming languages like C++ retains advantages in speed. We describe a Python interface to Marian NMT, a C++-based training and inference toolkit for sequence-to-sequence models, focusing on machine translation. This interface enables models trained with Marian to be connected to the rich, wide range of tools available in Python. A highlight of the interface is the ability to compute state-of-the-art COMET metrics from Python while using Marian's inference engine, with a speedup of up to $7.8\times$ over existing implementations. We also briefly spotlight a number of other integrations, including Jupyter notebooks, connection with prebuilt models, and a web app interface provided with the package. PyMarian is available on PyPI via $\texttt{pip install pymarian}$.
