
HYBRINFOX at CheckThat! 2024 -- Task 2: Enriching BERT Models with the Expert System VAGO for Subjectivity Detection

Morgane Casanova, Julien Chanson, Benjamin Icard, Géraud Faye, Guillaume Gadek, Guillaume Gravier, Paul Égré

TL;DR

HYBRINFOX addresses subjectivity detection by combining a fine-tuned RoBERTa classifier with a frozen sBERT semantic encoder and lexical scores computed by the expert system VAGO. The key lexical scores are the ratios $R_{vagueness}(\phi)=\frac{|V|_{\phi}}{N_{\phi}}$ and $R_{subjectivity}(\phi)=\frac{|S|_{\phi}}{N_{\phi}}$, which are fused with the neural representations before classification. On English data, the approach achieves a macro F1 of 0.7442, ranking 1st and outperforming the baseline. Results in the other languages depend on a translation step into English and are more mixed, so the authors advocate language-specific VAGO lexicons to reduce that dependence. Overall, the work illustrates the value of integrating explainable symbolic cues with deep models for subjectivity and vagueness detection, and it outlines future directions in multilingual lexicon development and translation-aware design.
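
To make the two ratios concrete, here is a minimal Python sketch. It assumes that $|V|_{\phi}$ and $|S|_{\phi}$ count the tokens of sentence $\phi$ that match a vague or a subjective lexicon, and that $N_{\phi}$ is the number of tokens in $\phi$. The tiny lexicons, the tokenizer, and the function names are hypothetical stand-ins for illustration, not the actual VAGO resources or implementation.

```python
import re

# Hypothetical toy lexicons; the real VAGO lexicon is far larger.
VAGUE_TERMS = {"some", "many", "about", "roughly", "several"}
SUBJECTIVE_TERMS = {"beautiful", "terrible", "great", "awful", "important"}


def tokenize(sentence):
    """Lowercase the sentence and keep runs of letters/apostrophes (a simplification)."""
    return re.findall(r"[a-z']+", sentence.lower())


def lexical_ratio(sentence, lexicon):
    """Return (number of lexicon matches) / (number of tokens); 0.0 for empty input."""
    tokens = tokenize(sentence)
    if not tokens:
        return 0.0
    hits = sum(1 for tok in tokens if tok in lexicon)
    return hits / len(tokens)


sentence = "Many people think this is a terrible idea"
r_vagueness = lexical_ratio(sentence, VAGUE_TERMS)          # 1 match out of 8 tokens
r_subjectivity = lexical_ratio(sentence, SUBJECTIVE_TERMS)  # 1 match out of 8 tokens
print(r_vagueness, r_subjectivity)
```

Because each score is a ratio over sentence length, it stays in $[0, 1]$ and can be compared across sentences of different lengths.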

Abstract

This paper presents the HYBRINFOX method used to solve Task 2 (subjectivity detection) of the CLEF 2024 CheckThat! competition. The method is a hybrid system that combines a RoBERTa model fine-tuned for subjectivity detection, a frozen sentence-BERT (sBERT) model to capture semantics, and several scores calculated by the English version of the expert system VAGO, which was developed independently of this task to measure vagueness and subjectivity in texts from a lexicon. In English, the HYBRINFOX method ranked 1st with a macro F1 score of 0.7442 on the evaluation data. For the other languages, the method used a translation step into English, producing more mixed results (ranking 1st in Multilingual and 2nd in Italian over the baseline, but under the baseline in Bulgarian, German, and Arabic). We explain the principles of our hybrid approach, and outline ways in which the method could be improved for languages other than English.
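
The reported metric, macro F1, is the unweighted mean of the per-class F1 scores, so each class counts equally regardless of how many examples it has. The sketch below computes it in plain Python; the label strings and the toy predictions are invented for illustration and are not taken from the task data.

```python
def f1_for_class(y_true, y_pred, label):
    """F1 of one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, lab) for lab in labels) / len(labels)


# Toy example with illustrative labels.
y_true = ["OBJ", "OBJ", "OBJ", "SUBJ", "SUBJ", "OBJ"]
y_pred = ["OBJ", "SUBJ", "OBJ", "SUBJ", "OBJ", "OBJ"]

score = macro_f1(y_true, y_pred)
print(score)  # F1(OBJ) = 0.75, F1(SUBJ) = 0.5, so the mean is 0.625
```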

Paper Structure

This paper contains 6 sections, 4 equations, 2 figures, and 3 tables.

Figures (2)

  • Figure 1: HYBRINFOX model combining BERT, sBERT and the four scores (S1, S2, S3, S4) calculated by the expert system VAGO for the task of objectivity versus subjectivity classification. The green arrows indicate the elements being trained.
  • Figure 2: Receiver Operating Characteristic (ROC) Curve for the "RoBERTa", the "RoBERTa + sBERT" and the "RoBERTa + sBERT + VAGO Scores" Systems Predictions.
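
One simple way to picture the combination described for Figure 1 is to concatenate the two sentence representations with the four VAGO scores and feed the result to a classification head. The Python sketch below assumes exactly that: concatenation as the fusion step and a single logistic unit as the head. Both are assumptions made for illustration, the vector sizes and values are invented, and the paper's actual architecture may differ.

```python
import math


def fuse_features(roberta_vec, sbert_vec, vago_scores):
    """Concatenate both sentence vectors with the four VAGO scores (S1..S4)."""
    if len(vago_scores) != 4:
        raise ValueError("expected exactly four VAGO scores (S1, S2, S3, S4)")
    return list(roberta_vec) + list(sbert_vec) + list(vago_scores)


def predict_subjective_prob(features, weights, bias):
    """Hypothetical logistic head: sigmoid of a weighted sum of the fused features."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Tiny made-up vectors; real encoders produce hundreds of dimensions.
roberta_vec = [0.2, -0.1]
sbert_vec = [0.5, 0.3, -0.4]
vago_scores = [0.125, 0.125, 0.0, 0.25]

fused = fuse_features(roberta_vec, sbert_vec, vago_scores)
weights = [0.0] * len(fused)  # untrained head: every weight is zero
prob = predict_subjective_prob(fused, weights, bias=0.0)
print(len(fused), prob)
```

In this picture, training would update the head and the RoBERTa encoder while the sBERT encoder stays frozen, in line with the description of a fine-tuned RoBERTa and a frozen sBERT.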