Do Robot Snakes Dream like Electric Sheep? Investigating the Effects of Architectural Inductive Biases on Hallucination

Jerry Huang, Prasanna Parthasarathi, Mehdi Rezagholizadeh, Boxing Chen, Sarath Chandar

TL;DR

This paper investigates whether architectural inductive biases in LLMs, specifically self-attention versus recurrent layers (and hybrids of the two), shape the propensity to hallucinate. The authors perform an extensive evaluation on 20 hallucination tasks spanning faithfulness and factuality, covering open-source models from under 1B to approximately 70B parameters, with and without instruction tuning and with controls on training data. They find that hallucination is a general phenomenon across architectures, but that task-specific tendencies differ: recurrent and hybrid models can be more faithful at small sizes, the benefits of instruction tuning are uneven, and factuality improves with size for all architectures. The work highlights the need for architecture-aware mitigation strategies and for more universal methods of improving robustness against hallucinations.

Abstract

The growth in prominence of large language models (LLMs) in everyday life can be largely attributed to their generative abilities, yet some of this is also owed to the risks and costs associated with their use. On one front is their tendency to hallucinate false or misleading information, limiting their reliability. On another is the increasing focus on the computational limitations associated with traditional self-attention-based LLMs, which has brought about new alternatives, in particular recurrent models, meant to overcome them. Yet it remains uncommon to consider these two concerns simultaneously. Do changes in architecture exacerbate or alleviate existing concerns about hallucinations? Do they affect how and where hallucinations occur? Through an extensive evaluation, we study how these architecture-based inductive biases affect the propensity to hallucinate. While hallucination remains a general phenomenon not limited to specific architectures, the situations in which hallucinations occur and the ease with which specific types of hallucinations can be induced can differ significantly based on the model architecture. These findings highlight the need to better understand both of these problems in conjunction with each other, as well as to consider how to design more universal techniques for handling hallucinations.

Paper Structure

This paper contains 31 sections, 6 equations, 6 figures, 9 tables.

Figures (6)

  • Figure 1: Performance on PopQA (left) and MemoTrap (right). Recurrent and hybrid models significantly outperform similar pure attention-based alternatives on MemoTrap, but the opposite is true on PopQA.
  • Figure 2: Changes in factuality (top) and faithfulness (bottom) as models are increased in size. Scores range between 0 and 100. Factuality always increases with the number of parameters; however, faithfulness increases are only meaningful for pure-attention models.
  • Figure 3: Performance to scale values for factuality and faithfulness. Colors differentiate model families while shapes differentiate the model type. Factuality improves with model size (indicated by values >0) and the model type does not have a link with this relative improvement. Recurrent/hybrid models show low values for faithfulness, indicating that size increases generally do not benefit them, unlike attention models.
  • Figure 4: Performance of various base models on tasks within the Hallucination Leaderboard.
  • Figure 5: Change in task performance from base model to instruction fine-tuned model, for all tasks.
  • ...and 1 more figure