A gentle push funziona benissimo: making instructed models in Italian via contrastive activation steering

Daniel Scalena, Elisabetta Fersini, Malvina Nissim

TL;DR

Italian steering can be applied successfully to different models, achieves performance comparable to, or even better than, that of models fine-tuned for Italian, and yields higher quality and consistency in Italian generations.

Abstract

Adapting a model to a language that was only partially present in its pre-training data requires fine-tuning, which is expensive in terms of both data and computational resources. As an alternative to fine-tuning, we explore the potential of activation steering-based techniques to enhance model performance on Italian tasks. Through our experiments we show that Italian steering (i) can be applied successfully to different models, (ii) achieves performance comparable to, or even better than, that of models fine-tuned for Italian, and (iii) yields higher quality and consistency in Italian generations. We also discuss the utility of steering and fine-tuning in the contemporary LLM landscape, where models already reach high performance in Italian even when they are not explicitly trained on this language.

Paper Structure

This paper contains 18 sections, 4 equations, 1 figure, 8 tables.

Figures (1)

  • Figure 1: Graphical representation of all the combinations of correct answers given by the models on the ARC challenge. Each column shows a different combination of correct answers across the approaches, together with its cardinality (e.g., the very last column shows a subset of 53 instances where only the IT-ITA model (ANITA) responds with the correct answer). The steered and the IT-ITA models have limited overlap in their correct responses, highlighting differences in their improvements. The IT-ITA model loses the ability to answer some questions (74) that the Original model could answer while, at the same time, learning to answer new questions (53) that the Original model could not. In contrast, steered models extend their range of correct answers while retaining most of the Original model's correct answers.
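
The counting behind a plot like Figure 1 can be illustrated with a minimal sketch: each question is assigned to the exact set of approaches that answered it correctly, and the size of each group is the column's cardinality. The function name `correct_answer_combinations`, the approach labels, and the question ids below are hypothetical and chosen only for illustration; they are not the paper's code or results.

```python
from collections import Counter


def correct_answer_combinations(correct_by_model):
    """Count questions by the exact set of approaches that answered them correctly.

    correct_by_model maps an approach name to the set of question ids it got right.
    Returns a Counter keyed by frozenset of approach names.
    """
    all_questions = set().union(*correct_by_model.values())
    combos = Counter()
    for question in all_questions:
        # The approaches that answered this particular question correctly.
        who = frozenset(
            name for name, ids in correct_by_model.items() if question in ids
        )
        combos[who] += 1
    return combos


# Toy question ids, invented for illustration only (NOT the paper's data).
toy = {
    "Original": {1, 2, 3, 4},
    "Steered": {1, 2, 3, 5},
    "IT-ITA": {1, 4, 6},
}

combos = correct_answer_combinations(toy)

# Print the combinations from largest to smallest, like the plot's columns.
for who, count in sorted(combos.items(), key=lambda kv: (-kv[1], sorted(kv[0]))):
    print(sorted(who), count)
```

In this toy setup, the group keyed by only `"IT-ITA"` plays the role of the caption's last column: questions that only the fine-tuned model answers correctly. Questions that no approach answers correctly do not appear in any group, since they are in none of the input sets.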