Automatic design optimization of preference-based subjective evaluation with online learning in crowdsourcing environment

Yusuke Yasuda; Tomoki Toda

TL;DR

This work proposes a method that automatically optimizes preference-based subjective evaluation, selecting which pairs to compare and how many evaluations to allocate to each pair, using online learning based on a sorting algorithm in a crowdsourcing environment.

Abstract

Preference-based subjective evaluation is a key method for evaluating generative media reliably. However, the large number of pair combinations it requires makes it impractical for large-scale evaluation using crowdsourcing. To address this issue, we propose an automatic optimization method for preference-based subjective evaluation that selects pair combinations and allocates evaluation volumes with online learning in a crowdsourcing environment. We use a preference-based online learning method based on a sorting algorithm to identify the total order of the evaluation targets with minimal sample volume. Our online learning algorithm supports parallel and asynchronous execution under the fixed-budget conditions required for crowdsourcing. Our experiment on preference-based subjective evaluation of synthetic speech shows that our method successfully optimizes the test: it reduces the pair combinations from 351 to 83 and allocates an optimal evaluation volume to each pair, ranging from 30 to 663, without compromising evaluation accuracy or wasting budget.
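To illustrate why a sorting algorithm reduces the number of pairs, here is a minimal sketch. It is not the paper's MERGE-RANK algorithm: it uses a plain merge sort with a noiseless preference oracle, whereas the abstract describes online learning over noisy listener judgments. The count of 27 targets is an assumption inferred from the arithmetic 27 × 26 / 2 = 351, and the names `count_all_pairs`, `merge_sort_with_pairs`, `quality`, and `prefer` are hypothetical.

```python
import math
import random


def count_all_pairs(n):
    """Number of unordered pairs in an exhaustive preference test."""
    return math.comb(n, 2)


def merge_sort_with_pairs(items, prefer, compared):
    """Sort items best-first, recording every pair that gets compared.

    prefer(a, b) returns True when a is preferred over b.
    compared is a set that collects the pairs actually evaluated.
    """
    if len(items) <= 1:
        return list(items)
    mid = len(items) // 2
    left = merge_sort_with_pairs(items[:mid], prefer, compared)
    right = merge_sort_with_pairs(items[mid:], prefer, compared)
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        a, b = left[i], right[j]
        compared.add(frozenset((a, b)))  # this pair needs an evaluation
        if prefer(a, b):
            merged.append(a)
            i += 1
        else:
            merged.append(b)
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


# Assumption: 27 targets, since C(27, 2) = 351 matches the abstract.
systems = [f"sys{k:02d}" for k in range(27)]
rng = random.Random(0)
# Hypothetical hidden quality scores (distinct), standing in for listeners.
quality = dict(zip(systems, rng.sample(range(100), len(systems))))


def prefer(a, b):
    # Noiseless oracle; a real test would aggregate many noisy votes.
    return quality[a] > quality[b]


compared = set()
ranking = merge_sort_with_pairs(systems, prefer, compared)
print("exhaustive pairs:", count_all_pairs(len(systems)))
print("pairs compared by merge sort:", len(compared))
```

In this idealized setting each pair is compared at most once, so the size of `compared` equals the number of comparisons. With noisy human judgments, each selected pair would need repeated evaluations, which is where the allocation of evaluation volumes described in the abstract comes in.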

Paper Structure (11 sections, 7 figures, 1 table, 4 algorithms)

Introduction
Issues in subjective evaluation
Dynamic optimization of preference test
An idea of dynamic preference test
MERGE-RANK algorithm
Online learning in crowdsourcing environment
Experimental evaluation
Experimental conditions
Experimental results
Conclusion
Acknowledgements

Figures (7)

Figure 1: An overview of dynamic preference test.
Figure 2: The winner estimation based on the statistical test.
Figure 3: Problems for online learning in asynchronous execution. This figure shows a situation where over-evaluation and unbalanced evaluation allocation occurs.
Figure 4: The final and initial order obtained with the modified MRA in a preference test. The initial order is based on the pre-recorded MOS. Adjacent pairs without statistical significance are colored in the same color.
Figure 5: Distributions of metrics from pairwise evaluations.
...and 2 more figures
