Divide et Impera: Multi-Transformer Architectures for Complex NLP-Tasks

Solveig Helland, Elena Gavagnin, Alexandre de Spindler

TL;DR

The paper tackles the challenge of applying transformer models to complex, domain-specific NLP tasks where fine-tuning data is scarce and controllability is desired. It introduces a modular pipeline that splits a complex task into subtasks (bias classification, extraction, and reformulation) and assigns dedicated models to each step, chaining them to produce the final output. In experiments on gender bias removal using GPT-3, the multi-subtask approach (M-3) significantly outperforms single-model (M-1) and two-model (M-2) baselines in both F1 debiasing scores and the Mean Squared Word Neutrality metric, using a small fine-tuning dataset. This demonstrates improved accuracy and controllability with data-efficient fine-tuning, and suggests that the framework can generalize to other complex NLP tasks and bias types, enabling scalable, task-specific transformer deployment.

Abstract

The growing capabilities of transformer models pave the way for solving increasingly complex NLP tasks. A key to supporting application-specific requirements is the ability to fine-tune. However, compiling a fine-tuning dataset tailored to complex tasks is tedious and results in large datasets, limiting the ability to control transformer output. We present an approach in which complex tasks are divided into simpler subtasks. Multiple transformer models are fine-tuned to one subtask each, and lined up to accomplish the complex task. This simplifies the compilation of fine-tuning datasets and increases overall controllability. Using the example of reducing gender bias as a complex task, we demonstrate our approach and show that it performs better than using a single model.
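The chaining idea described above can be illustrated with a minimal sketch. It assumes the three subtasks named in the TL;DR (bias classification, extraction, reformulation) and replaces each fine-tuned transformer with a trivial rule-based stub so that the example runs on its own. The function names, the `TOY_LEXICON` dictionary, and the control flow are hypothetical and are not taken from the paper.

```python
import re
from typing import Callable, List

# Hypothetical toy lexicon standing in for knowledge a fine-tuned model would hold.
TOY_LEXICON = {"chairman": "chairperson", "policeman": "police officer"}


def tokenize(sentence: str) -> List[str]:
    """Lowercased alphabetic tokens of the sentence."""
    return re.findall(r"[A-Za-z]+", sentence.lower())


def classify(sentence: str) -> bool:
    """Subtask 1 (stub): does the sentence contain a biased term?"""
    return any(word in TOY_LEXICON for word in tokenize(sentence))


def extract(sentence: str) -> List[str]:
    """Subtask 2 (stub): list the biased terms found in the sentence."""
    return [word for word in tokenize(sentence) if word in TOY_LEXICON]


def reformulate(sentence: str, terms: List[str]) -> str:
    """Subtask 3 (stub): replace each extracted term with a neutral one."""
    for term in terms:
        pattern = rf"\b{re.escape(term)}\b"
        sentence = re.sub(pattern, TOY_LEXICON[term], sentence, flags=re.IGNORECASE)
    return sentence


def debias(
    sentence: str,
    classifier: Callable[[str], bool] = classify,
    extractor: Callable[[str], List[str]] = extract,
    rewriter: Callable[[str, List[str]], str] = reformulate,
) -> str:
    """Chain the three subtask components; each one can be swapped independently."""
    if not classifier(sentence):
        return sentence  # nothing to do, later stages are skipped
    terms = extractor(sentence)
    return rewriter(sentence, terms)


if __name__ == "__main__":
    print(debias("The chairman opened the meeting."))
    print(debias("The meeting started late."))
```

Because every stage is passed in as a callable, any stub could be exchanged for a call to a separately fine-tuned model without touching the other stages, which is the kind of per-subtask control the abstract argues for.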

Paper Structure

This paper contains 10 sections, 1 equation, 1 figure, and 2 tables.

Figures (1)

  • Figure 1: Activity diagram of the debiasing approach with three subtasks (grey boxes) and seven transformer models (white boxes) in total.