Large Model Strategic Thinking, Small Model Efficiency: Transferring Theory of Mind in Large Language Models

Nunzio Lore, Sepehr Ilami, Babak Heydari

TL;DR

Fine-tuning a smaller language model on the answers and motivations of a larger model from the same family consistently narrowed the performance gap between the two. The improvements extended beyond the training examples to out-of-sample scenarios, including ones with completely different game structures.

Abstract

As the performance of larger, newer Large Language Models continues to improve on strategic Theory of Mind (ToM) tasks, the demand for these state-of-the-art models increases commensurately. However, their deployment is costly in terms of both processing power and time. In this paper, we investigate the feasibility of creating smaller, high-performing specialized models by way of fine-tuning. To do this, we first present a large pre-trained model with 20 unique scenarios that combine different social contexts with games of varying social dilemmas, record its answers, and use them for Q&A fine-tuning of a smaller model of the same family. Our focus is on in-context game-theoretic decision-making, the domain in which human interaction occurs and which requires both a theory of mind (or a semblance thereof) and an understanding of social dynamics. The smaller model is therefore trained not just on the answers of the larger model, but also on the motivations it provides, which should contain advice and guidelines for navigating both strategic dilemmas and social cues. We find that the fine-tuned smaller model consistently narrowed the gap in performance between the smaller pre-trained model and its larger relative, and that its improvements extended to areas and contexts beyond those in the training examples, including out-of-sample scenarios with completely different game structures. Averaged over all games, fine-tuning yielded a 46% improvement in the smaller model, measured as alignment towards the behavior of the larger model, with 100% representing indistinguishable behavior. When presented with out-of-sample social contexts and games, the fine-tuned model still displays notable levels of alignment, improving by 18% and 28%, respectively.
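The abstract reports improvement as alignment towards the larger model, with 100% meaning indistinguishable behavior, but it does not give a formula. The sketch below is one plausible reading, not the authors' definition: it assumes each model is summarized by a single rate (for example, its propensity to cooperate) and measures how much of the original gap to the larger model is closed by fine-tuning. The function name and arguments are hypothetical.

```python
def alignment_improvement(p_small: float, p_finetuned: float, p_large: float) -> float:
    """Fraction of the small-to-large behavioral gap closed by fine-tuning.

    Assumed definition (not taken from the paper):
        (|p_small - p_large| - |p_finetuned - p_large|) / |p_small - p_large|
    A value of 1.0 (100%) means the fine-tuned model matches the large model;
    0.0 means no change; negative values mean the gap widened.
    """
    gap_before = abs(p_small - p_large)
    if gap_before == 0:
        raise ValueError("models already agree; improvement is undefined")
    gap_after = abs(p_finetuned - p_large)
    return (gap_before - gap_after) / gap_before


# Illustrative numbers only, not results from the paper.
example = alignment_improvement(p_small=1.0, p_finetuned=0.75, p_large=0.5)
print(f"gap closed: {example:.0%}")
```

Taking absolute gaps means a fine-tuned model that overshoots the larger model is penalized in the same way as one that falls short.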

Paper Structure (11 sections, 5 figures, 1 table)

Figures (5)

  • Figure 1: Overview of the methods employed in this paper. We pair all games and scenarios to generate 20 unique combinations, which form the backbone of our dataset. We then submit each combination to each model and obtain 300 observations per combination. For LLaMa2-70b, we ask for an answer and a motivation; we ask the other models only for their answers. We use the answers from LLaMa2-70b to fine-tune a small, pre-trained LLaMa2-7b with LoRA. The fine-tuned model is then queried in the same way as the pre-trained model. Finally, we collect all data and measure the impact of fine-tuning on preferences.
  • Figure 2: Preliminary investigation of differences in responses between LLaMa2-7b and LLaMa2-70b, grouped by context and game. Clockwise, from the left: propensity to cooperate in LLaMa2-7b grouped by context; propensity to cooperate in LLaMa2-7b grouped by game; propensity to cooperate in LLaMa2-70b grouped by context; propensity to cooperate in LLaMa2-70b grouped by game. Notably, LLaMa2-7b is almost entirely indifferent to context and game and displays a strong bias towards cooperation, whereas LLaMa2-70b adapts markedly to new contexts and game structures.
  • Figure 3: Improvement (on the $y$-axis) for the fine-tuned version of LLaMa2-7b on within-sample scenarios, grouped by (a) game or (b) context. The red vertical line represents the average improvement.
  • Figure 4: Improvement for the fine-tuned version of LLaMa2-7b on out-of-sample contexts and in-sample games, grouped by (a) game or (b) context. We adopt the same conventions as in Fig. 3. We keep the structure of the payoffs identical for games both within and out-of-sample. As opposed to within-sample scenarios, we observe no exacerbation or overcorrection.
  • Figure 5: Average contribution to the public good for each model, with standard errors in black.
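The design described in the Figure 1 caption and the summary statistic shown in Figure 5 can be sketched as follows. This is a minimal illustration under stated assumptions: the captions give only the total of 20 combinations and 300 observations per combination, so the 4 x 5 split between games and contexts and all labels and function names here are hypothetical placeholders.

```python
import itertools
import math
import statistics

# Placeholder labels: only the total of 20 combinations comes from the caption;
# the 4 x 5 split and these names are assumptions for illustration.
GAMES = [f"game_{i}" for i in range(1, 5)]
CONTEXTS = [f"context_{j}" for j in range(1, 6)]
OBSERVATIONS_PER_COMBINATION = 300  # stated in the Figure 1 caption


def build_design(games, contexts):
    """Pair every game with every context."""
    return list(itertools.product(games, contexts))


def mean_and_standard_error(values):
    """Return the sample mean and the standard error of the mean."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(len(values))
    return mean, sem


design = build_design(GAMES, CONTEXTS)
total_queries = len(design) * OBSERVATIONS_PER_COMBINATION
print(len(design), "combinations,", total_queries, "queries per model")
```

Passing the 300 recorded choices for one combination (encoded as 0 or 1) to `mean_and_standard_error` would give a cooperation rate with its standard error, the kind of summary plotted with error bars in Figure 5.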