Lon-ea at SemEval-2023 Task 11: A Comparison of Activation Functions for Soft and Hard Label Prediction

Peyman Hosseini; Mehran Hosseini; Sana Sabah Al-Azzawi; Marcus Liwicki; Ignacio Castro; Matthew Purver

TL;DR

The paper investigates how the output-layer activation function affects soft-label and hard-label prediction in the learning-with-disagreement setting of SemEval-2023 Task 11. It introduces a novel sinusoidal activation function, SSF, parameterized by $a$ (annotators) and $\theta$, and compares it with a widened sigmoid and a post-training step function, all on top of BERT-based preprocessors and encoders. Soft labels are trained to reflect annotator disagreement; hard labels are derived by rounding the soft-label predictions and are evaluated with micro F1-score. SSF often improves hard-label prediction on ArMIS and MD-Agreement, while sigmoid yields strong soft-label performance on several datasets. The results suggest SSF is best suited to tasks in which each item is labeled by a small, fixed number of annotators, and they motivate applying it to other such datasets and exploring disagreement-aware activation functions in other domains.

Abstract

We study the influence of different activation functions in the output layer of deep neural network models for soft and hard label prediction in the learning with disagreement task. In this task, the goal is to quantify the amount of disagreement by predicting soft labels. To predict the soft labels, we use BERT-based preprocessors and encoders and vary the activation function used in the output layer, while keeping other parameters constant. The soft labels are then used for the hard label prediction. The activation functions considered are sigmoid, a step function that is added to the model post-training, and a sinusoidal activation function, which is introduced for the first time in this paper.
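The step from soft labels to hard labels, and the micro F1-score used to evaluate hard labels, can be illustrated with a minimal sketch. It is not the authors' code: the function names `hard_from_soft` and `micro_f1` are hypothetical, the labels are assumed to be binary, and a soft label of exactly 0.5 is assumed to round up to 1.

```python
from typing import Sequence


def hard_from_soft(soft: Sequence[float], threshold: float = 0.5) -> list[int]:
    """Round each soft label (probability of class 1) to a hard 0/1 label.

    Assumption: ties at the threshold go to class 1.
    """
    return [1 if p >= threshold else 0 for p in soft]


def micro_f1(y_true: Sequence[int], y_pred: Sequence[int],
             labels: Sequence[int] = (0, 1)) -> float:
    """Micro-averaged F1: pool TP/FP/FN over all classes, then compute F1."""
    tp = fp = fn = 0
    for label in labels:
        for t, p in zip(y_true, y_pred):
            if p == label and t == label:
                tp += 1
            elif p == label and t != label:
                fp += 1
            elif p != label and t == label:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy data, invented for illustration only.
soft_predictions = [0.9, 0.4, 0.3, 0.5]
gold_hard_labels = [1, 0, 1, 1]

hard_predictions = hard_from_soft(soft_predictions)
score = micro_f1(gold_hard_labels, hard_predictions)
```

When every item carries exactly one label, as here, micro F1 pooled over all classes coincides with plain accuracy.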

Paper Structure (14 sections, 2 equations, 2 figures, 2 tables)

Introduction
Background
Task Setup and Description
System Overview
Soft Label Prediction
Approach 1: Sigmoid Activation
Approach 2: Sinusoidal Activation
Approach 3: Step Function
Hard Label Prediction
Experimental Setup
Results
Soft Evaluation Results
Hard Evaluation Results
Conclusion

Figures (2)

Figure 1: The plot of SSF for $\theta = 0.05$ and $a = 3$.
Figure 2: The discrete step function with $a = 3$.
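The idea behind a discrete step function applied after training can be sketched as follows. The paper's exact definition is not reproduced on this page, so the sketch rests on a stated assumption: with $a$ annotators, a soft label can only take the values $0, 1/a, \dots, 1$, and the step function snaps a continuous prediction to the nearest of these levels. The name `snap_to_annotator_grid` is hypothetical.

```python
import math


def snap_to_annotator_grid(p: float, a: int = 3) -> float:
    """Snap a prediction to the nearest multiple of 1/a inside [0, 1].

    Assumption (not taken from the paper): the achievable soft labels
    are k/a for k = 0..a, and halfway cases round up.
    """
    if a <= 0:
        raise ValueError("a must be a positive integer")
    clipped = min(1.0, max(0.0, p))       # keep the value inside [0, 1]
    level = math.floor(clipped * a + 0.5)  # nearest grid index, ties round up
    return level / a


# Toy predictions, invented for illustration only.
snapped = [snap_to_annotator_grid(p, a=3) for p in (-0.2, 0.3, 0.7, 0.95)]
```

With $a = 3$ the four toy predictions land on the levels 0, 1/3, 2/3 and 1 respectively.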
