ArgInstruct: Specialized Instruction Fine-Tuning for Computational Argumentation

Maja Stahl, Timon Ziegenbein, Joonsuk Park, Henning Wachsmuth

TL;DR

ArgInstruct introduces a specialized instruction fine-tuning framework for computational argumentation (CA) that combines manually crafted CA seed tasks with 52k synthetically generated CA instructions. The authors curate 105 seed CA tasks, generate further CA instructions and instances via a self-instruct-like loop, and mix these with general instruction data to train CA-specialized LLMs. Empirical results show improved generalization to unseen CA tasks while performance on general NLP tasks is preserved, with ArgInstruct outperforming several baselines on CA tasks and achieving strong zero-shot CA capabilities. The work provides a CA benchmark and a scalable method that could transfer to other NLP areas requiring domain knowledge.

Abstract

Training large language models (LLMs) to follow instructions has significantly enhanced their ability to tackle unseen tasks. However, despite their strong generalization capabilities, instruction-following LLMs encounter difficulties when dealing with tasks that require domain knowledge. This work introduces specialized instruction fine-tuning for the domain of computational argumentation (CA). The goal is to enable an LLM to effectively tackle unseen CA tasks while preserving its generalization capabilities. To this end, we reviewed existing CA research and crafted natural language instructions for 105 CA tasks. On this basis, we developed a CA-specific benchmark that allows for a comprehensive evaluation of LLMs' capabilities in solving various CA tasks. We synthesized 52k CA-related instructions, adapting the self-instruct process to train a CA-specialized instruction-following LLM. Our experiments suggest that CA-specialized instruction fine-tuning significantly enhances the LLM on both seen and unseen CA tasks. At the same time, performance on the general NLP tasks of the SuperNI benchmark remains stable.

Paper Structure

This paper contains 42 sections, 3 figures, and 10 tables.

Figures (3)

  • Figure 1: Comparison of fine-tuning methods: (a) Optimizing an LLM for a CA task on input-output pairs. (b) Making an LLM instruction-following on highly diverse tasks. (c) Our method: Making an LLM an instruction-following CA specialist on diverse CA-specific tasks.
  • Figure 2: Overview of our methodology: We manually craft CA-specific seed tasks and prompt an LLM to generate new CA-specific tasks in a loop by (1) generating new instructions, (2) filtering them for CA relevance and novelty, and (3) generating corresponding instances. (4) After postprocessing, the generated CA-specific tasks from the task pool are combined with existing general tasks to specialize an LLM for CA using instruction fine-tuning.
  • Figure 3: The 20 most common root verbs (inner circle) and their top four direct noun objects (outer circle) in our generated instructions highlight their CA focus.
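
The generation loop described for Figure 2 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the callables `propose`, `is_ca_relevant`, and `make_instances` are hypothetical stand-ins for LLM prompts, the novelty check uses `difflib` string similarity with an arbitrary 0.7 threshold as a placeholder for whatever measure the paper applies, and postprocessing is reduced to dropping tasks without instances.

```python
import random
from difflib import SequenceMatcher
from typing import Callable, Dict, List, Optional

Task = Dict[str, object]  # {"instruction": str, "instances": list}


def is_novel(candidate: str, pool: List[str], threshold: float = 0.7) -> bool:
    """Return True if the candidate is not too similar to any pooled instruction.

    Assumption: plain string similarity stands in for the paper's novelty filter.
    """
    return all(
        SequenceMatcher(None, candidate.lower(), existing.lower()).ratio() < threshold
        for existing in pool
    )


def generate_tasks(
    seed_instructions: List[str],
    propose: Callable[[List[str]], List[str]],      # hypothetical LLM call
    is_ca_relevant: Callable[[str], bool],          # hypothetical LLM call
    make_instances: Callable[[str], List[dict]],    # hypothetical LLM call
    target_size: int,
    n_examples: int = 4,
    max_rounds: int = 100,
    rng: Optional[random.Random] = None,
) -> List[Task]:
    """Grow a task pool from seed instructions in a self-instruct-like loop."""
    rng = rng or random.Random(0)
    pool = list(seed_instructions)
    generated: List[Task] = []
    for _ in range(max_rounds):
        if len(generated) >= target_size:
            break
        # (1) Prompt with a few pooled instructions to get new candidates.
        examples = rng.sample(pool, min(n_examples, len(pool)))
        for instruction in propose(examples):
            if len(generated) >= target_size:
                break
            # (2) Filter candidates for CA relevance and novelty.
            if not is_ca_relevant(instruction) or not is_novel(instruction, pool):
                continue
            # (3) Generate instances for the accepted instruction.
            instances = make_instances(instruction)
            if not instances:  # simplified postprocessing: drop empty tasks
                continue
            pool.append(instruction)
            generated.append({"instruction": instruction, "instances": instances})
    return generated


def mix_training_data(
    ca_tasks: List[Task],
    general_tasks: List[Task],
    rng: Optional[random.Random] = None,
) -> List[Task]:
    """(4) Combine generated CA tasks with general tasks and shuffle them."""
    rng = rng or random.Random(0)
    mixed = list(ca_tasks) + list(general_tasks)
    rng.shuffle(mixed)
    return mixed
```

Passing the LLM-dependent steps in as callables keeps the control flow of the loop testable without committing to a particular model or prompt format; the mixing ratio between CA and general tasks is likewise left to the caller.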