Seeing the Goal, Missing the Truth: Human Accountability for AI Bias

Sean Cao; Wei Jiang; Hui Xu

TL;DR

This paper investigates whether revealing the downstream use of LLM outputs induces goal-conditioned distortions in intermediate signals. Using earnings-call transcripts, the authors compare goal-blind and goal-aware prompts to generate sentiment and competition scores that feed standard econometric forecasts of stock returns and EPS. They find that goal awareness improves predictive content before the model's knowledge cutoff but does not improve out-of-sample generalization after the cutoff, suggesting the effects arise from objective-conditioned optimization rather than genuine signal enhancement. The work highlights a human-centered channel of AI bias and argues for measurement workflows that remain agnostic to downstream tasks and are rigorously tested out of sample, preserving credibility and robustness in AI-assisted research.

Abstract

This research explores how human-defined goals influence the behavior of Large Language Models (LLMs) through purpose-conditioned cognition. Using financial prediction tasks, we show that revealing the downstream use of LLM outputs (e.g., predicting stock returns or earnings) leads the LLM to generate biased sentiment and competition measures, even though these measures are intended to be independent of the downstream task. Goal-aware prompting shifts intermediate measures toward the disclosed downstream objective. This purpose leakage improves performance before the LLM's knowledge cutoff but yields no advantage after it. AI bias due to "seeing the goal" is not an algorithmic flaw; it is a matter of human accountability in research design, which must ensure the statistical validity and reliability of AI-generated measurements.
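
The comparison described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the prompt wording, the `Observation` record, and the function names are hypothetical and are not the paper's prompts or code. A simple Pearson correlation also stands in for the portfolio sorts and predictive regressions that the paper actually reports. The sketch only shows the shape of the design: score each call under a goal-blind and a goal-aware prompt, then compare how well each score tracks the outcome before and after the model's knowledge cutoff.

```python
from dataclasses import dataclass
from datetime import date
from math import sqrt

# Hypothetical prompt wording, for illustration only (not the paper's prompts).
GOAL_BLIND = "Rate the sentiment of this earnings call from -1 to 1.\n\n{transcript}"
GOAL_AWARE = (
    "Rate the sentiment of this earnings call from -1 to 1. "
    "The score will be used to predict next-month stock returns.\n\n{transcript}"
)


@dataclass
class Observation:
    """One earnings call with both LLM scores and the realized outcome."""
    call_date: date
    blind_score: float   # score from the goal-blind prompt
    aware_score: float   # score from the goal-aware prompt
    next_return: float   # realized outcome the score is meant to forecast


def build_prompts(transcript: str) -> dict:
    """Return the two prompt variants for the same transcript."""
    return {
        "goal_blind": GOAL_BLIND.format(transcript=transcript),
        "goal_aware": GOAL_AWARE.format(transcript=transcript),
    }


def pearson(xs, ys):
    """Pearson correlation; assumes equal lengths and non-constant inputs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def compare_around_cutoff(observations, cutoff: date) -> dict:
    """Correlate each score with the outcome, separately pre and post cutoff."""
    periods = {
        "pre_cutoff": [o for o in observations if o.call_date <= cutoff],
        "post_cutoff": [o for o in observations if o.call_date > cutoff],
    }
    result = {}
    for label, subset in periods.items():
        returns = [o.next_return for o in subset]
        result[label] = {
            "goal_blind": pearson([o.blind_score for o in subset], returns),
            "goal_aware": pearson([o.aware_score for o in subset], returns),
        }
    return result
```

Under this setup, purpose leakage would appear as a goal-aware advantage in the `pre_cutoff` entry that disappears in the `post_cutoff` entry. The goal-blind score serves as the baseline in both periods.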

Paper Structure (14 sections, 4 equations, 3 figures, 5 tables)

This paper contains 14 sections, 4 equations, 3 figures, 5 tables.

Introduction
Experimental Design and Data
LLM Prompt and Scoring
Data and LLM model
Evaluation Metrics
Findings
Sentiment Scores and Stock Returns Prediction
Portfolio Sorts and Return Performance
Predictive Regressions and Out-of-Sample Performance
Earnings Prediction with Competition Scores
Predictive Regressions and Out-of-Sample Performance
Conclusion
Literature Review
Prompts

Figures (3)

Figure 1: Cumulative Long–Short Portfolio Returns from Goal-Aware and Goal-Blind Sentiment Scores
Figure 2: Monthly Out-of-Sample Forecast Accuracy Using Goal-Aware and Goal-Blind Sentiment Scores
Figure 3: Quarterly Out-of-Sample Forecast Accuracy Using GPT-Derived Competition Scores
