Subjective Logic Encodings

Jake Vasilakes, Chrysoula Zerva, Sophia Ananiadou

TL;DR

Subjective Logic Encodings (SLEs) provide a principled framework for treating annotations as subjective opinions rather than gold-standard facts. Each annotation $i$ by annotator $m$ is encoded as a Dirichlet target distribution over $K$ classes, built from a belief vector $\boldsymbol{b}^{(i)}_m$, an uncertainty mass $u^{(i)}_m$, and a base rate $\boldsymbol{a}$, with parameters $\boldsymbol{\alpha}^{(i)}_m = \frac{2\boldsymbol{b}^{(i)}_m}{u^{(i)}_m} + K\boldsymbol{a}$ and expectation $\mathbb{E}[\omega^{(i)}_m] = \boldsymbol{b}^{(i)}_m + u^{(i)}_m \boldsymbol{a}$. Opinions are merged via cumulative belief fusion $\oplus$ and trust discounting $\otimes$ to form aggregated targets $\omega^{(i)}_{\lozenge}$, which the model learns to predict by distribution matching with a KL divergence objective, using smoothing $\epsilon$ and the reverse KL to stabilize optimization. SLEs generalize traditional label encodings, and experiments on synthetic and real-world NLP/CV data indicate that they capture disagreement, reliability, and confidence when available, often matching or exceeding baselines. In short, SLEs offer a flexible, scalable way to learn from annotation uncertainty, enabling more faithful models in subjective tasks where gold-standard labels are elusive or misleading.
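
As a rough illustration of the encoding described above, the following Python sketch computes the Dirichlet parameters and the expectation exactly as they are written in the TL;DR, and merges two opinions with the standard cumulative belief fusion operator from Subjective Logic. The function names, the tuple representation of an opinion, and the example numbers are assumptions made for illustration and are not taken from the paper. Trust discounting is not covered here.

```python
from typing import List, Tuple

# Hypothetical representation: an opinion is (belief vector b, uncertainty u),
# with sum(b) + u == 1. The base rate a is passed separately.
Opinion = Tuple[List[float], float]


def dirichlet_params(b: List[float], u: float, a: List[float]) -> List[float]:
    """alpha = 2*b/u + K*a, as stated in the TL;DR (requires u > 0)."""
    if u <= 0:
        raise ValueError("uncertainty u must be positive to map to a Dirichlet")
    K = len(b)
    return [2.0 * b_k / u + K * a_k for b_k, a_k in zip(b, a)]


def expectation(b: List[float], u: float, a: List[float]) -> List[float]:
    """E[omega] = b + u*a, as stated in the TL;DR."""
    return [b_k + u * a_k for b_k, a_k in zip(b, a)]


def cumulative_fusion(op_a: Opinion, op_b: Opinion) -> Opinion:
    """Standard cumulative belief fusion of two opinions over the same classes.

    Assumes the two uncertainties are not both zero.
    """
    (b_a, u_a), (b_b, u_b) = op_a, op_b
    denom = u_a + u_b - u_a * u_b
    if denom == 0:
        raise ValueError("fusion is undefined here when both uncertainties are zero")
    b = [(x * u_b + y * u_a) / denom for x, y in zip(b_a, b_b)]
    u = (u_a * u_b) / denom
    return b, u


# Illustrative binary example (K = 2) with a uniform base rate.
base_rate = [0.5, 0.5]
annotator_1: Opinion = ([0.6, 0.2], 0.2)
annotator_2: Opinion = ([0.1, 0.5], 0.4)

alpha_1 = dirichlet_params(annotator_1[0], annotator_1[1], base_rate)
mean_1 = expectation(annotator_1[0], annotator_1[1], base_rate)
fused_b, fused_u = cumulative_fusion(annotator_1, annotator_2)

print("alpha:", alpha_1)
print("expectation:", mean_1)
print("fused belief:", fused_b, "fused uncertainty:", fused_u)
```

In this binary example the Dirichlet parameters come out at approximately [7, 3], whose normalized values agree with the expectation of approximately [0.7, 0.3]. The fused opinion has lower uncertainty than either input, which is the intended effect of accumulating evidence from independent annotators.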

Abstract

Many existing approaches for learning from labeled data assume the existence of gold-standard labels. According to these approaches, inter-annotator disagreement is seen as noise to be removed, either through refinement of annotation guidelines, label adjudication, or label filtering. However, annotator disagreement can rarely be totally eradicated, especially on more subjective tasks such as sentiment analysis or hate speech detection where disagreement is natural. Therefore, a new approach to learning from labeled data, called data perspectivism, seeks to leverage inter-annotator disagreement to learn models that stay true to the inherent uncertainty of the task by treating annotations as opinions of the annotators, rather than gold-standard facts. Despite this conceptual grounding, existing methods under data perspectivism are limited to using disagreement as the sole source of annotation uncertainty. To expand the possibilities of data perspectivism, we introduce Subjective Logic Encodings (SLEs), a flexible framework for constructing classification targets that explicitly encodes annotations as opinions of the annotators. Based on Subjective Logic Theory, SLEs encode labels as Dirichlet distributions and provide principled methods for encoding and aggregating various types of annotation uncertainty -- annotator confidence, reliability, and disagreement -- into the targets. We show that SLEs are a generalization of other types of label encodings as well as how to estimate models to predict SLEs using a distribution matching objective.

Paper Structure

This paper contains 26 sections, 12 equations, 1 figure, 4 tables, 1 algorithm.

Figures (1)

  • Figure 1: F1, JSD, and NES results for Majority Voting (MV), Soft Voting (Soft), and SLE cumulative fusion (Fused), using either all annotations or filtered annotations, on the synthetic data across different ranges of uncertainty. Results using CrowdTruth were nearly identical to Soft and are omitted here for clarity.