In Agents We Trust, but Who Do Agents Trust? Latent Source Preferences Steer LLM Generations

Mohammad Aflah Khan, Mahsa Amani, Soumi Das, Bishwamittra Ghosh, Qinyuan Wu, Krishna P. Gummadi, Manish Gupta, Abhilasha Ravichander

TL;DR

Controlled experiments on twelve LLMs from six model providers, spanning both synthetic and real-world tasks, show that several models consistently exhibit strong and predictable source preferences.

Abstract

Agents based on Large Language Models (LLMs) are increasingly being deployed as interfaces to information on online platforms. These agents filter, prioritize, and synthesize information retrieved from the platforms' back-end databases or via web search. In these scenarios, LLM agents govern the information users receive by drawing users' attention to particular instances of retrieved information at the expense of others. While much prior work has focused on biases in the information LLMs themselves generate, less attention has been paid to the factors that influence what information LLMs select and present to users. We hypothesize that when information is attributed to specific sources (e.g., particular publishers, journals, or platforms), current LLMs exhibit systematic latent source preferences; that is, they prioritize information from some sources over others. Through controlled experiments on twelve LLMs from six model providers, spanning both synthetic and real-world tasks, we find that several models consistently exhibit strong and predictable source preferences. These preferences are sensitive to contextual framing, can outweigh the influence of content itself, and persist despite explicit prompting to avoid them. They also help explain phenomena such as the left-leaning skew in news recommendations observed in prior work. Our findings call for deeper investigation into the origins of these preferences, as well as for mechanisms that provide users with transparency and control over the biases guiding LLM-powered agents.

Paper Structure

This paper contains 50 sections, 2 equations, 28 figures, and 6 tables.

Figures (28)

  • Figure 1: Spread of Preference % across models and sources. More results are presented in the appendix.
  • Figure 2: Heatmaps of correlations. (a) Agreement between rankings for the Political Leaning News Set. Further results are presented in the appendix. (b) Agreement between direct and indirect rankings per model. Empty cells in (b) indicate cases where uniform preferences prevented ranking.
  • Figure 3: Research Set Ranking Across Different Paper Topics (Indirect Experiments). Further results are presented in the appendix.
  • Figure 4: Correlations between indirect rankings across identities for GPT-4.1-Mini. Further results are presented in the appendix.
  • Figure 5: Correlations between a rational ranking of credentials and the direct (lower triangle) or indirect (upper triangle) rankings, across models and source sets.
  • ...and 23 more figures