Can Large Language Models perform Relation-based Argument Mining?
Deniz Gorur, Antonio Rago, Francesca Toni
TL;DR
This study investigates binary relation-based argument mining (RbAM) using open-source Large Language Models (LLMs) with simple primed prompting, eliminating the need for fine-tuning. Across ten datasets, LLMs such as Llama 70B-4bit and Mixtral 8x7B-4bit outperform a strong RoBERTa baseline, achieving a macro F1 of up to about 75 and demonstrating cross-dataset robustness. The work highlights the practicality of prompt-based LLMs for cross-domain RbAM, while also outlining trade-offs in inference speed and hardware demands. It also points to future directions, including moving to ternary RbAM and improving attack-prediction accuracy, with potential downstream benefits for online debate platforms and evidence-gathering tasks.
Abstract
Argument mining (AM) is the process of automatically extracting arguments, their components and/or relations amongst arguments and components from text. As the number of platforms supporting online debate increases, the need for AM becomes ever more urgent, especially in support of downstream tasks. Relation-based AM (RbAM) is a form of AM focusing on identifying agreement (support) and disagreement (attack) relations amongst arguments. RbAM is a challenging classification task, with existing methods failing to perform satisfactorily. In this paper, we show that general-purpose Large Language Models (LLMs), appropriately primed and prompted, can significantly outperform the best-performing (RoBERTa-based) baseline. Specifically, we experiment with two open-source LLMs (Llama-2 and Mistral) on ten datasets.
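To make the idea of priming and prompting for binary RbAM concrete, the following is a minimal sketch and not the authors' actual prompt or code. It assumes a simple few-shot layout in which labelled argument pairs precede the pair to classify; the field names (`Arg1`, `Arg2`, `Relation`), the priming examples, and the helper functions `build_primed_prompt` and `parse_relation` are all hypothetical. The call to an LLM is deliberately left out, since the paper's inference setup is not described in this section.

```python
from typing import List, Tuple

# Hypothetical priming examples (parent argument, child argument, relation).
# These are illustrative only and are not drawn from the paper's datasets.
PRIMING_EXAMPLES: List[Tuple[str, str, str]] = [
    ("We should ban cars in city centres.",
     "Car-free zones cut air pollution.", "support"),
    ("We should ban cars in city centres.",
     "Local shops would lose customers.", "attack"),
]


def build_primed_prompt(examples: List[Tuple[str, str, str]],
                        parent: str, child: str) -> str:
    """Build a few-shot prompt: labelled pairs first, then the unlabelled pair."""
    blocks = []
    for ex_parent, ex_child, label in examples:
        blocks.append(f"Arg1: {ex_parent}\nArg2: {ex_child}\nRelation: {label}\n")
    # The final block ends at "Relation:" so the model completes the label.
    blocks.append(f"Arg1: {parent}\nArg2: {child}\nRelation:")
    return "\n".join(blocks)


def parse_relation(completion: str) -> str:
    """Map a raw model completion to 'support', 'attack', or 'unknown'."""
    text = completion.strip().lower()
    if text.startswith("support"):
        return "support"
    if text.startswith("attack"):
        return "attack"
    return "unknown"
```

In use, the string returned by `build_primed_prompt` would be sent to whichever LLM is being evaluated, and the generated text passed to `parse_relation`. How completions that match neither label should be scored is a design choice this sketch leaves open.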
