Embracing Diversity: A Multi-Perspective Approach with Soft Labels

Benedetta Muscato, Praveen Bushipaka, Gizem Gezici, Lucia Passaro, Fosca Giannotti, Tommaso Cucinotta

TL;DR

This paper addresses stance detection on controversial topics by embracing annotator diversity through perspectivism. It introduces a multi-stage framework that augments data with document summaries and annotations from humans and LLMs, and compares a baseline hard-label approach with a multi-perspective soft-label approach. Empirical results show that soft labels improve performance, with human annotations (HD) outperforming LLM annotations (LLMD), and calibration revealing nuanced effects on model confidence. The work advances responsible NLP by promoting perspective-aware modeling and sharing disaggregated datasets to capture diverse viewpoints in subjective tasks.

Abstract

Prior studies show that embracing the annotation diversity shaped by different backgrounds and life experiences, and incorporating it into model learning (the multi-perspective approach), contributes to the development of more responsible models. In this paper we therefore propose a new framework for designing and evaluating perspective-aware models on the stance detection task, in which multiple annotators assign stances on a controversial topic. We also share a new dataset built from both human and LLM annotations. Results show that the multi-perspective approach yields better classification performance (higher F1-scores) than traditional approaches that use a single ground truth, while displaying lower model confidence scores, probably due to the high subjectivity of the stance detection task.

Paper Structure

This paper contains 28 sections, 2 equations, 4 figures, 4 tables.

Figures (4)

  • Figure 1: The multi-perspective framework for the stance detection task. The dataset preparation phase comprises summarization and further augmentation with LLM annotations. Annotations are then transformed into hard and soft labels, the model is fine-tuned, and the classifier's final prediction scores are calibrated.
  • Figure 2: GPT4-Turbo prompts for text summarization (a) and for LLM annotations (b).
  • Figure 3: Olmo-7b and Llama 3-8b label distribution (train, val, test)
  • Figure 4: Mistral-7b and Majority label LLM distribution (train, val, test)
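
The transformation of annotations into hard and soft labels mentioned in the Figure 1 caption can be illustrated with a minimal sketch. This is not the authors' implementation: the label names, the function names `soft_label` and `hard_label`, and the tie-breaking rule are all assumptions made for illustration. It assumes a hard label is the majority vote over annotators and a soft label is the normalized distribution of their votes.

```python
from collections import Counter

# Hypothetical stance classes; the summary above does not list the actual label set.
LABELS = ["against", "neutral", "pro"]


def _check(annotations, labels):
    """Reject empty input and votes outside the label set."""
    if not annotations:
        raise ValueError("need at least one annotation")
    unknown = set(annotations) - set(labels)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")


def soft_label(annotations, labels=LABELS):
    """Return the share of annotators that chose each label, in `labels` order."""
    _check(annotations, labels)
    counts = Counter(annotations)
    total = len(annotations)
    return [counts[label] / total for label in labels]


def hard_label(annotations, labels=LABELS):
    """Return the majority-vote label.

    Ties go to the label that appears first in `labels`; this is an
    arbitrary choice for the sketch, not something stated in the source.
    """
    _check(annotations, labels)
    counts = Counter(annotations)
    return max(labels, key=lambda label: counts[label])


# Three hypothetical annotators disagree on one document.
votes = ["pro", "pro", "against"]
soft = soft_label(votes)   # share of votes for against, neutral, pro
hard = hard_label(votes)   # single label kept by the hard-label baseline
```

The soft label keeps the minority "against" vote as probability mass, while the hard label discards it. That difference is what separates the multi-perspective approach from the single ground-truth baseline.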