MetaMetrics-MT: Tuning Meta-Metrics for Machine Translation via Human Preference Calibration

David Anugraha; Garry Kuwanto; Lucky Susanto; Derry Tanti Wijaya; Genta Indra Winata

TL;DR

MetaMetrics-MT outperforms all existing baselines in the reference-based setting, establishing a new state of the art, and achieves results comparable to leading metrics in the reference-free setting while offering greater efficiency.

Abstract

We present MetaMetrics-MT, a metric for evaluating machine translation (MT) that is tuned to align closely with human preferences using Bayesian optimization with Gaussian Processes. MetaMetrics-MT enhances existing MT metrics by optimizing their correlation with human judgments. Our experiments on the WMT24 metric shared task dataset show that MetaMetrics-MT outperforms all existing baselines in the reference-based setting, establishing a new state of the art. In the reference-free setting, it achieves results comparable to leading metrics while offering greater efficiency.
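
To make the idea of "optimizing correlation with human judgments" concrete, the following is a minimal sketch rather than the authors' implementation. It assumes the meta-metric is a weighted sum of existing metric scores and that the objective is Kendall's tau against human ratings; both choices, along with every function name and the toy data, are illustrative assumptions. To stay dependency-free, it substitutes plain random search for the Bayesian optimization with Gaussian Processes named in the abstract.

```python
import random


def kendall_tau(x, y):
    """Kendall tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                score += 1
            elif s < 0:
                score -= 1
    return score / (n * (n - 1) / 2)


def combine(weights, metric_scores):
    """Weighted sum of per-segment scores; one score list per metric."""
    n_segments = len(metric_scores[0])
    return [
        sum(w * m[i] for w, m in zip(weights, metric_scores))
        for i in range(n_segments)
    ]


def objective(weights, metric_scores, human):
    """Correlation between the combined metric and the human ratings."""
    return kendall_tau(combine(weights, metric_scores), human)


def tune_weights(metric_scores, human, n_trials=200, seed=0):
    """Random-search stand-in for the GP-based Bayesian optimization."""
    rng = random.Random(seed)
    k = len(metric_scores)
    # Start from uniform weights so the result is never worse than a plain average.
    best_w = [1.0 / k] * k
    best_val = objective(best_w, metric_scores, human)
    for _ in range(n_trials):
        w = [rng.random() for _ in range(k)]
        val = objective(w, metric_scores, human)
        if val > best_val:
            best_w, best_val = w, val
    return best_w, best_val


# Hypothetical toy data: six segments, two existing metrics, one human rating each.
human = [0.1, 0.4, 0.35, 0.8, 0.9, 0.6]
metric_a = [0.2, 0.3, 0.5, 0.7, 0.95, 0.65]
metric_b = [0.9, 0.1, 0.5, 0.3, 0.2, 0.6]

weights, tau = tune_weights([metric_a, metric_b], human)
print("weights:", weights, "tau:", tau)
```

A Gaussian Process optimizer would replace the sampling loop in `tune_weights` by proposing each new weight vector from a surrogate model of the objective, which typically needs fewer evaluations than random search; the objective itself would stay the same under the assumptions above.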
