On Theoretically-Driven LLM Agents for Multi-Dimensional Discourse Analysis

Maciej Uberna, Michał Wawer, Jarosław A. Chudziak, Marcin Koszowy

TL;DR

The paper addresses how reformulation functions in argumentative discourse can be analyzed beyond surface-level paraphrase. It proposes a theoretically grounded multi-agent system (MAS) with Retrieval-Augmented Generation (RAG) and a zero-shot baseline to classify rephrases into a five-category taxonomy (D-I-S-G-O) using a gold-standard US2016 debate corpus. Empirically, the RAG-enhanced MAS significantly outperforms the zero-shot baseline (macro F1 ~ 0.67 vs ~0.27; MCC ~ 0.64 vs ~0.16), with the largest gains in Generalisation and other pragmatic functions, suggesting that explicit theory improves function-aware discourse analysis. The work demonstrates a scalable, interpretable framework for identifying rhetorical strategies in contemporary discourse, with potential applications in misinformation detection and online harms mitigation.

Abstract

Identifying the strategic uses of reformulation in discourse remains a key challenge for computational argumentation. While LLMs can detect surface-level similarity, they often fail to capture the pragmatic functions of rephrasing, such as its role within rhetorical discourse. This paper presents a comparative multi-agent framework designed to quantify the benefits of incorporating explicit theoretical knowledge for this task. We utilise a dataset of annotated political debates to establish a new standard encompassing four distinct rephrase functions (Deintensification, Intensification, Specification, and Generalisation) plus an Other category, which covers all remaining types (D-I-S-G-O). We then evaluate two parallel LLM-based agent systems: one enhanced by argumentation theory via Retrieval-Augmented Generation (RAG), and an otherwise identical zero-shot baseline. The results reveal a clear performance gap: the RAG-enhanced agents substantially outperform the baseline across the board, with particularly strong advantages in detecting Intensification and Generalisation, yielding an overall Macro F1-score improvement of nearly 30%. Our findings provide evidence that theoretical grounding is not only beneficial but essential for advancing beyond mere paraphrase detection towards function-aware analysis of argumentative discourse. This comparative multi-agent architecture represents a step towards scalable, theoretically informed computational tools capable of identifying rhetorical strategies in contemporary discourse.
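
For readers unfamiliar with the two headline metrics (macro F1 and the Matthews correlation coefficient, MCC), the following is a minimal sketch of how they can be computed for a five-way D-I-S-G-O classification. It is not the authors' evaluation code: the function names and the toy label lists are assumptions made for illustration, and the multi-class MCC uses the standard generalisation of the binary formula.

```python
from collections import Counter
from math import sqrt

# The five categories of the taxonomy described in the abstract.
LABELS = ["D", "I", "S", "G", "O"]


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1 scores (hypothetical helper)."""
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # A class with no true and no predicted instances scores 0 here.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def multiclass_mcc(y_true, y_pred):
    """Multi-class Matthews correlation coefficient (hypothetical helper)."""
    n = len(y_true)
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    classes = set(true_counts) | set(pred_counts)
    cov_tp = correct * n - sum(true_counts[k] * pred_counts[k] for k in classes)
    cov_tt = n * n - sum(v * v for v in true_counts.values())
    cov_pp = n * n - sum(v * v for v in pred_counts.values())
    denom = sqrt(cov_tt * cov_pp)
    return cov_tp / denom if denom else 0.0


# Toy labels invented for illustration; these are NOT data from the paper.
toy_true = ["D", "I", "S", "G", "O", "D", "I", "S", "G", "O"]
toy_pred = ["D", "I", "S", "G", "O", "D", "I", "O", "O", "O"]

print(f"macro F1: {macro_f1(toy_true, toy_pred):.3f}")
print(f"MCC:      {multiclass_mcc(toy_true, toy_pred):.3f}")
```

Macro F1 weights every category equally, so a system that ignores a rare function such as Generalisation is penalised even when overall accuracy looks acceptable. MCC additionally accounts for the distribution of both true and predicted labels, which is why the two metrics are often reported together for imbalanced taxonomies.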

Paper Structure

This paper contains 18 sections, 4 figures, and 3 tables.

Figures (4)

  • Figure 1: Classification of rephrase based on Kiljan2024.
  • Figure 2: MAS architecture. The Informed System (top) equips agents with RAG access to argumentation theory, while the Zero-Shot System (bottom) relies solely on pre-trained LLM knowledge. Both share identical agent roles and orchestration.
  • Figure 3: Confusion matrix of the zero-shot MAS.
  • Figure 4: Confusion matrix of the RAG-enhanced MAS.
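
Figures 3 and 4 are confusion matrices over the five D-I-S-G-O categories. As a rough illustration of how such a matrix is tabulated, here is a minimal sketch; the helper names and the toy labels are assumptions for illustration and do not reproduce the paper's figures or data.

```python
# Categories of the taxonomy, in the row/column order used below.
LABELS = ["D", "I", "S", "G", "O"]


def confusion_matrix(y_true, y_pred, labels=LABELS):
    """Rows are gold labels, columns are predicted labels (hypothetical helper)."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for gold, predicted in zip(y_true, y_pred):
        matrix[index[gold]][index[predicted]] += 1
    return matrix


def format_matrix(matrix, labels=LABELS):
    """Render the matrix as aligned plain text (hypothetical helper)."""
    header = "    " + " ".join(f"{label:>3}" for label in labels)
    rows = [
        f"{label:<4}" + " ".join(f"{count:>3}" for count in row)
        for label, row in zip(labels, matrix)
    ]
    return "\n".join([header] + rows)


# Toy labels invented for illustration; these are NOT data from the paper.
gold = ["D", "I", "S", "G", "O", "G"]
pred = ["D", "I", "O", "G", "O", "S"]

cm = confusion_matrix(gold, pred)
print(format_matrix(cm))
```

Off-diagonal cells show which functions are mistaken for which. Comparing the two systems cell by cell is what makes per-category gains, such as those the abstract reports for Generalisation, visible.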