nchellwig at SemEval-2026 Task 3: Self-Consistent Structured Generation (SCSG) for Dimensional Aspect-Based Sentiment Analysis using Large Language Models

Nils Constantin Hellwig; Jakob Fehle; Udo Kruschwitz; Christian Wolff

TL;DR

Evaluation across 6 languages and 8 language-domain combinations demonstrates that self-consistency with 15 executions yields statistically significant improvements over single-inference prompting, with the SCSG system ranking in the top seven across all settings.

Abstract

We present Self-Consistent Structured Generation (SCSG) for Dimensional Aspect-Based Sentiment Analysis in SemEval-2026 Task 3 (Track A). SCSG enhances prediction reliability by executing a LoRA-adapted large language model multiple times per instance, retaining only tuples that achieve a majority consensus across runs. To mitigate the computational overhead of multiple forward passes, we leverage vLLM's PagedAttention mechanism for efficient key-value cache reuse. Evaluation across 6 languages and 8 language-domain combinations demonstrates that self-consistency with 15 executions yields statistically significant improvements over single-inference prompting, with our system (leveraging Gemma 3) ranking in the top seven across all settings, achieving second place on three out of four English subsets and first place on Tatar-Restaurant for DimASTE.

Paper Structure (26 sections, 1 equation, 2 figures, 7 tables)

Introduction
System Overview
Fine-Tuning Setup & Prompting Strategy
Training Hyperparameters
Prompt
Validation Module
Inference Optimization
Experimental Setup
Datasets
Training Configuration
Evaluation
Pilot study: Determining the Optimal Output Validation Mechanism
Results
Strong performance scores in the English language
Self-consistency improves test-set performance
...and 11 more sections

Figures (2)

Figure 1: Prompt used for SCSG. The prompt comprises descriptions of the considered sentiment elements (4 for DimASTE, 5 for DimASQP), explanations regarding the range of valence and arousal, the desired output format, and the example text for which ABSA is to be performed.
Figure 2: Self-consistency majority voting for DimASTE over $k=5$ runs. Aspect-sentiment pairs (ignoring valence-arousal values) appearing in $\geq \tau = \lceil k/2 \rceil$ runs are aggregated by averaging their valence and arousal values. The aggregation section shows the explicit calculation. Light blue rows highlight matching (Decor, nice) variants; light green rows highlight matching (service, spotty) variants. For DimASQP, in addition to the aspect term and sentiment polarity, the aspect category is considered as well.
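The voting rule described in the Figure 2 caption can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function name `aggregate_runs`, the tuple layout `(aspect, opinion, valence, arousal)`, exact string matching of pairs, counting a pair at most once per run, and all valence-arousal numbers in `example_runs` are assumptions made up for the example. Only the threshold $\tau = \lceil k/2 \rceil$ and the averaging of valence and arousal come from the caption.

```python
from collections import defaultdict
from math import ceil
from statistics import mean


def aggregate_runs(runs):
    """Majority-vote over k runs, then average valence and arousal.

    `runs` is a list of k run outputs; each run is a list of
    (aspect, opinion, valence, arousal) tuples (hypothetical layout).
    """
    k = len(runs)
    tau = ceil(k / 2)  # a pair must appear in at least tau runs

    # Map each (aspect, opinion) pair to the VA values seen across runs.
    votes = defaultdict(list)
    for run in runs:
        seen_in_run = set()
        for aspect, opinion, valence, arousal in run:
            key = (aspect, opinion)  # VA values are ignored for matching
            if key in seen_in_run:
                continue  # assumption: count a pair at most once per run
            seen_in_run.add(key)
            votes[key].append((valence, arousal))

    aggregated = []
    for (aspect, opinion), va_values in votes.items():
        if len(va_values) >= tau:
            avg_valence = mean([v for v, _ in va_values])
            avg_arousal = mean([a for _, a in va_values])
            aggregated.append((aspect, opinion, avg_valence, avg_arousal))
    return aggregated


# Illustrative input with k = 5 runs; all numbers are invented.
example_runs = [
    [("Decor", "nice", 7.0, 6.0), ("service", "spotty", 3.0, 5.0)],
    [("Decor", "nice", 7.5, 6.0)],
    [("Decor", "nice", 7.0, 6.5), ("service", "spotty", 3.5, 5.5)],
    [("service", "spotty", 2.5, 4.5), ("food", "great", 8.0, 7.0)],
    [("Decor", "nice", 6.5, 5.5)],
]

if __name__ == "__main__":
    for row in aggregate_runs(example_runs):
        print(row)
```

With $k=5$ the threshold is $\tau=3$, so in this made-up input (Decor, nice) with four votes and (service, spotty) with three votes are kept, while (food, great) with a single vote is discarded. For DimASQP, the same sketch would presumably extend the matching key with the aspect category.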
