Your AI, Not Your View: The Bias of LLMs in Investment Analysis

Hoyoung Lee, Junhyuk Seo, Suhwan Park, Junhyeong Lee, Wonbin Ahn, Chanyeol Choi, Alejandro Lopez-Lira, Yongjae Lee

TL;DR

The paper addresses knowledge conflict in LLM-based financial analysis by introducing a three-stage experimental framework that elicits and verifies latent biases across 427 S&P 500 stocks and multiple models. It finds robust, model-specific biases toward the Technology sector, large-cap stocks, and contrarian strategies. These biases surface as confirmation bias when the models are exposed to counter-evidence, and an entropy analysis relates bias strength to decision uncertainty. The authors provide a methodology for bias elicitation and verification (varying the volume and intensity of counter-evidence) and show cross-model variability, informing practitioners about the risks of using LLMs for investment decisions. The work emphasizes trustworthy AI in finance, offers a public leaderboard for broader benchmarking, and urges auditing and mitigation to align AI outputs with user intent and institutional objectives.

Abstract

In finance, Large Language Models (LLMs) face frequent knowledge conflicts arising from discrepancies between their pre-trained parametric knowledge and real-time market data. These conflicts are especially problematic in real-world investment services, where a model's inherent biases can misalign with institutional objectives, leading to unreliable recommendations. Despite this risk, the intrinsic investment biases of LLMs remain underexplored. We propose an experimental framework to investigate emergent behaviors in such conflict scenarios, offering a quantitative analysis of bias in LLM-based investment analysis. Using hypothetical scenarios with balanced and imbalanced arguments, we extract the latent biases of models and measure their persistence. Our analysis, centered on sector, size, and momentum, reveals distinct, model-specific biases. Across most models, a tendency to prefer technology stocks, large-cap stocks, and contrarian strategies is observed. These foundational biases often escalate into confirmation bias, causing models to cling to initial judgments even when faced with increasing counter-evidence. A public leaderboard benchmarking bias across a broader set of models is available at https://linqalpha.com/leaderboard

Paper Structure

This paper contains 25 sections, 6 equations, 9 figures, 4 tables.

Figures (9)

  • Figure 1: A conceptual illustration of knowledge conflict in LLM-based financial services. Even when a firm targets a specific investment theme (e.g., Energy), the LLM’s inherent preferences (e.g., Technology) may override user intent, producing biased and inconsistent recommendations.
  • Figure 2: The three-stage experimental framework: (1) Generating balanced evidence, (2) Eliciting bias through knowledge conflict, and (3) Verifying the resulting bias against counter evidence.
  • Figure 3: Sector bias scores for each evaluated LLM. Scores represent the mean of three independent sets of 10 trials; the standard deviation in parentheses reflects the variation across the three sets. Green indicates a positive (buy) bias and red a negative (sell) bias. A strong bias toward the Technology sector is evident in most models.
  • Figure 4: Size bias scores for each evaluated LLM across four market-capitalization quantiles (Q1: largest, Q4: smallest). Bias scores consistently decline as company size decreases.
  • Figure 5: Win rates for Contrarian versus Momentum preferences for each model. The results show a consistent preference for the Contrarian view across most models.
  • ...and 4 more figures
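
The aggregation described in the Figure 3 caption (a mean over three independent sets of 10 trials, with the standard deviation taken across the three sets) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the +1 (buy) / -1 (sell) encoding of each trial, the function names `set_score` and `aggregate_bias`, the use of the sample standard deviation, and the example data are all assumptions made here for clarity.

```python
from statistics import mean, stdev


def set_score(decisions):
    """Score one set of trials as the mean decision.

    Assumed encoding (hypothetical): +1 for a buy decision, -1 for a sell.
    A score near +1 means the model almost always chose buy.
    """
    return mean(decisions)


def aggregate_bias(sets):
    """Return (mean, standard deviation) of the per-set scores.

    The standard deviation is computed across sets, not across individual
    trials. The sample standard deviation is an assumption; the source
    does not say which estimator is used.
    """
    scores = [set_score(s) for s in sets]
    return mean(scores), stdev(scores)


# Made-up example: three sets of 10 trials for one model and one sector.
example_sets = [
    [1] * 8 + [-1] * 2,  # 8 buys, 2 sells
    [1] * 6 + [-1] * 4,  # 6 buys, 4 sells
    [1] * 7 + [-1] * 3,  # 7 buys, 3 sells
]
bias_mean, bias_std = aggregate_bias(example_sets)
print(f"{bias_mean:.2f} ({bias_std:.2f})")
```

Reporting the spread across sets rather than across all 30 trials shows how stable the score is under repeated runs, which is how the caption describes the value in parentheses.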