Sparse Regression for Machine Translation

Ergun Biçici

TL;DR

This work introduces RegMT, a transductive regression framework that learns sparse mappings between source and target feature sets from parallel corpora to aid machine translation. By leveraging $L_1$ regularization (lasso) and a novel dice-based instance selection, RegMT achieves stronger target-feature estimation and translation quality than $L_2$ approaches, particularly in graph-based decoding tasks. The study demonstrates encouraging German–English and Spanish–English results and even replaces a phrase table with learned mappings, highlighting potential benefits in low-resource or constrained decoding scenarios. Overall, the approach offers a model- and data-efficient alternative to conventional phrase-table-based MT, with a measurable link between $F_1$ optimization and BLEU improvements.

Abstract

We use transductive regression techniques to learn mappings between source and target features of given parallel corpora and use these mappings to generate machine translation outputs. We show that $L_1$ regularized regression (lasso) is more effective than $L_2$ regularized regression for learning mappings between sparsely observed feature sets. Proper selection of training instances plays an important role in learning correct feature mappings within limited computational resources and at expected accuracy levels. We introduce the dice instance selection method, which selects training instances so as to improve the source and target coverage of the training set. We show that $L_1$ regularized regression performs better than $L_2$ regularized regression both in regression measurements and in translation experiments using graph decoding. We present encouraging results when translating from German to English and Spanish to English. We also demonstrate results when the phrase table of a phrase-based decoder is replaced with the mappings we find with the regression model.
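The contrast between $L_1$ and $L_2$ regularization can be illustrated on a toy problem. The sketch below is not the RegMT implementation. It assumes a single target feature (one column of a mapping), synthetic Gaussian source features, and a plain proximal-gradient (ISTA) solver chosen here for brevity. All names (`lasso_ista`, `ridge_closed_form`, `lam`) are illustrative.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink toward zero, clip small values to exactly 0."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=500):
    """Minimize 0.5 * ||y - X w||^2 + lam * ||w||_1 with ISTA (illustrative solver)."""
    # Step size 1/L, where L is the largest eigenvalue of X^T X.
    step = 1.0 / np.linalg.norm(X, 2) ** 2
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)
        w = soft_threshold(w - step * grad, step * lam)
    return w

def ridge_closed_form(X, y, lam):
    """Minimize ||y - X w||^2 + lam * ||w||^2 via the normal equations."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# Synthetic data (an assumption for illustration): only 3 of 20 features matter.
rng = np.random.default_rng(0)
n, d = 200, 20
X = rng.normal(size=(n, d))
w_true = np.zeros(d)
w_true[[0, 5, 12]] = [2.0, -3.0, 1.5]
y = X @ w_true + 0.01 * rng.normal(size=n)

w_lasso = lasso_ista(X, y, lam=5.0)
w_ridge = ridge_closed_form(X, y, lam=5.0)

print("non-zero weights, lasso:", np.count_nonzero(w_lasso))
print("non-zero weights, ridge:", np.count_nonzero(w_ridge))
```

The soft-thresholding step sets small coefficients to exactly zero, so the lasso solution keeps only a few active features, whereas the ridge solution shrinks all coefficients but leaves them non-zero. This is the property that makes $L_1$ regularization a natural fit for sparsely observed feature sets.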

Paper Structure

This paper contains 12 sections, 7 equations, 4 figures, and 4 tables.

Introduction
Machine Translation Using Regression
$L_1$ Regularized Regression
Instance Selection
Regression Experiments
Tuning for Target $F_1$
Target Feature Estimation as Classification
Regression Results
Graph Decoding Experiments
Decoding Experiments with Moses
Results
Contributions

Figures (4)

Figure 1: String-to-string mapping.
Figure 2: de-en translation results for increasing $m$ using $2$-grams.
Figure 3: es-en translation results for increasing $m$ using $1$- and $2$-grams.
Figure 4: Moses de-en translation results with RegMT W used as the phrase table.
