Predicting Word Similarity in Context with Referential Translation Machines

Ergun Biçici

TL;DR

The paper reframes word similarity in context as a machine translation performance prediction (MTPP) problem using Referential Translation Machines (RTMs), which provide a common representation for the training and test sets in context-sensitive similarity scoring. It constructs intra- and inter-context word-pair similarity (wps) features and trains unsupervised models to approximate Graded Word Similarity in Context (GWSC) scores, reporting that RTMs can achieve top results in the GWSC task. Beyond GWSC, the approach extends to emotion intensity prediction in multi-language tweets and to identifying discriminative semantic attributes, using stacked RTM models that integrate predictions from multiple MTPP predictors. The work illustrates that RTMs transfer across tasks, supported by a data pipeline that selects interpretants and by extensive feature engineering. Overall, RTMs offer a data-driven path to automatic, context-aware semantic evaluation.

Abstract

We identify the similarity between two words in English by casting the task as machine translation performance prediction (MTPP) between the words given the context and the distance between their similarities. We use referential translation machines (RTMs), which allow a common representation for training and test sets and stacked machine learning models. RTMs can achieve the top results in the Graded Word Similarity in Context (GWSC) task.

Paper Structure (8 sections, 1 equation, 3 figures, 8 tables)

Grading the Similarity of Words within Context
wps Features
Unsupervised Learning of Word Pair Similarity
Predicting the Intensity of the Structure and Content in Tweets
Predicting the Attributes that Discriminate the Semantics
Stacked RTM Models for Predicting the Discriminative Power of Attributes
Conclusion
English Lexicon used from WordNet Affect Emotion Lists

Figures (3)

Figure 1: Intra- and inter-context wps averages. The intra score averages pairs with words on different sides of the context, and the inter score averages all 3x3.
Figure 2: RTM depiction: parfda selects interpretants close to the data using corpora; two MTPPS use interpretants, training data, and test data to generate features in the same space; learning and prediction use these features as input where spheres represent feature spaces.
Figure 3: RTM with stacked combined prediction uses a combined model to obtain feature representations and predictions for $w_1 \rightarrow a$ and $w_2 \rightarrow a$, which are processed before additional learning and prediction.
