Task-driven Layerwise Additive Activation Intervention

Hieu Trung Nguyen, Bao Nguyen, Binh Nguyen, Viet Anh Nguyen

TL;DR

This work tackles real-time adaptation of decoder-only language models to new contexts by introducing a layerwise additive activation intervention that learns a task-specific vector $\Delta \in \mathbb{R}^D$ and adds it to the last-token activations. The method minimizes $\mathrm{Loss}(\Delta)$ with joint $\ell_1$ and group-$\ell_2$ regularization to promote sparsity and prevent overfitting, concentrating the intervention on a single layer to enable stable, compositional task learning. It improves on zero-/few-shot prompting and other intervention baselines on the Rule Understanding and Opinion Dynamics tasks, and scales across model sizes, including Llama3-8B. The approach offers a practical, sample-efficient mechanism for task-driven LM adaptation, with potential applications in controllable text generation and alignment.
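To make the two ingredients of the summary concrete, here is a minimal Python sketch of (a) adding a vector to the last-token activation and (b) a joint $\ell_1$ plus group-$\ell_2$ penalty on that vector. It is an illustration under stated assumptions, not the authors' implementation: the function names, the weights `lam_l1` and `lam_group`, and the choice of equal-sized contiguous groups (for example, one group per attention head) are all hypothetical, and the task loss that the penalty would be added to is omitted.

```python
import math


def add_intervention(hidden, delta):
    """Return a copy of `hidden` with `delta` added to the last token only.

    `hidden` is a list of per-token activation vectors (lists of floats).
    """
    *prefix, last = hidden
    if len(last) != len(delta):
        raise ValueError("delta must match the activation dimension")
    steered = [h + d for h, d in zip(last, delta)]
    return [list(row) for row in prefix] + [steered]


def sparse_group_penalty(delta, group_size, lam_l1, lam_group):
    """Compute lam_l1 * ||delta||_1 + lam_group * sum_g ||delta_g||_2.

    Assumption: groups are equal-sized contiguous chunks of `delta`.
    """
    if len(delta) % group_size != 0:
        raise ValueError("len(delta) must be a multiple of group_size")
    l1 = sum(abs(x) for x in delta)
    group_l2 = 0.0
    for start in range(0, len(delta), group_size):
        group = delta[start:start + group_size]
        group_l2 += math.sqrt(sum(x * x for x in group))
    return lam_l1 * l1 + lam_group * group_l2


# Toy example: two tokens, activation dimension 4, two groups of size 2.
hidden = [[1.0, 2.0, 3.0, 4.0], [0.5, 0.5, 0.5, 0.5]]
delta = [0.0, 0.0, 3.0, -4.0]  # only the second group is active

steered = add_intervention(hidden, delta)
penalty = sparse_group_penalty(delta, group_size=2, lam_l1=0.1, lam_group=1.0)
```

In this toy case only the last token changes, and the inactive group contributes nothing to the group term. This is the mechanism by which such a penalty can switch off whole groups while the $\ell_1$ term encourages sparsity within the groups that remain.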

Abstract

Modern language models (LMs) have significantly advanced generative modeling in natural language processing (NLP). Despite their success, LMs often struggle to adapt to new contexts in real-time applications. A promising approach to task adaptation is activation intervention, which steers an LM's generation process by identifying and manipulating its activations. However, existing interventions depend heavily on heuristic rules or require many prompt inputs to determine effective interventions. This paper proposes a layer-wise additive activation intervention framework that optimizes the intervention process, thereby improving sample efficiency. We benchmark our framework on various datasets, demonstrating accuracy improvements over pre-trained LMs and competing intervention baselines.

Paper Structure

This paper contains 18 sections, 4 equations, 2 figures, 6 tables.

Figures (2)

  • Figure 1: Average Exact Match for unregularized interventions at different layers. Results are averaged over five random seeds.
  • Figure 2: Intervened vector values across Llama3-8B attention heads (row-wise, from 1 to 32). Adding regularization promotes sparsity in the intervened values and yields desirable properties consistent with previous empirical observations.