Empirical Asset Pricing with Large Language Model Agents

Junyan Cheng; Peter Chin

TL;DR

This paper introduces a hybrid asset pricing framework that leverages Large Language Model (LLM) agents to perform discretionary, memory-enabled news analysis, which is then integrated with manually curated financial factors in a neural pricing network. The approach demonstrates superior performance in both portfolio optimization (higher Sharpe ratios and lower drawdowns) and asset pricing errors (smaller alphas and stronger t-stats) relative to strong baselines, across a dataset combining three years of news with long-span market data. Key contributions include the LLM agent architecture with memory, a daily smoothed news embedding state, and a downstream hybrid network that fuses qualitative and quantitative signals; comprehensive ablations illustrate the value of iterative analysis, asset embeddings, and factor interactions. The results suggest that LLM-driven discretionary analysis can meaningfully enhance empirical asset pricing and offer practical avenues for more efficient capital allocation, with implications for both research and market practice.

Abstract

In this study, we introduce a novel asset pricing model built on Large Language Model (LLM) agents. The model integrates qualitative discretionary investment evaluations produced by the LLM agents with manually curated quantitative financial and economic factors, with the aim of explaining excess asset returns. Experimental results show that our method outperforms traditional machine learning-based baselines in both portfolio optimization and asset pricing error. Notably, the Sharpe ratio for portfolio optimization and the mean $|\alpha|$ for anomaly portfolios improve by 10.6\% and 10.0\%, respectively. We also perform comprehensive ablation studies and a thorough analysis of the method to gain further insight into the proposed approach. Our results provide evidence for the feasibility of applying LLMs to empirical asset pricing.
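The two headline metrics in the abstract can be made concrete with a minimal sketch. It assumes the conventional definitions (Sharpe ratio as mean excess return over its sample standard deviation, annualized; mean $|\alpha|$ as the average absolute pricing error across test portfolios). The function names, the monthly annualization default, and the relative-change helper are illustrative assumptions, not the authors' evaluation code.

```python
import math
import statistics


def sharpe_ratio(excess_returns, periods_per_year=12):
    """Annualized Sharpe ratio of periodic excess returns.

    Assumption: returns are already in excess of the risk-free rate,
    and periods_per_year=12 corresponds to monthly data.
    """
    mean = statistics.mean(excess_returns)
    vol = statistics.stdev(excess_returns)  # sample standard deviation
    return mean / vol * math.sqrt(periods_per_year)


def mean_abs_alpha(alphas):
    """Average magnitude of pricing errors (alphas) across portfolios."""
    return sum(abs(a) for a in alphas) / len(alphas)


def relative_change(new, old):
    """Relative change of a metric against a baseline value."""
    return (new - old) / abs(old)


if __name__ == "__main__":
    # Toy numbers for illustration only; they are not results from the paper.
    toy_returns = [0.02, -0.01, 0.03, 0.01]
    toy_alphas = [-0.2, 0.1, 0.3]
    print(sharpe_ratio(toy_returns))
    print(mean_abs_alpha(toy_alphas))
```

Under these definitions a higher Sharpe ratio is better and a lower mean $|\alpha|$ is better, so an improvement in the second metric corresponds to a negative value of `relative_change`.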

Paper Structure (32 sections, 2 equations, 11 figures, 5 tables)

Introduction
Related Work
Asset Pricing for Security.
Financial Machine Learning.
Large Language Model Agents for Finance.
Method
Discretionary Analysis with LLM Agent
Hybrid Asset Pricing Network
Experiment
Experiment Setting
Portfolio Optimization
Asset Pricing Error
Ablation Study
Agent Architecture Design
Analysis Depth and Width
...and 17 more sections

Figures (11)

Figure 1: The LLM agent produces analysis reports from the latest news through multi-step refinement, incorporating past reports and domain knowledge from memory. For simplicity, the filter for irrelevant news is omitted. A macro note and a micro note, continuously updated with the latest analysis report, provide additional context. The average embedding of the daily analysis reports is fed into the pricing network along with the daily manual factors.
Figure 2: Keywords in the titles of news articles from the highly complicated first two years of our dataset, on days when the market return is positive (left) and negative (right).
Figure 3: Illustration of our hybrid asset pricing network. Purple boxes mark computational components, yellow boxes are data, and the circled plus symbol denotes concatenation. The MSE loss computed on the predicted returns is fed back to update the network.
Figure 4: The Sharpe ratio of equal-weighted portfolios for different values of $K$ and $N$.
Figure 5: Example of the Hybrid agent baseline, which analyzes raw news without iterative refinement of the analysis report and without the macroeconomic and market trend notes.
...and 6 more figures
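
The data flow described in the captions of Figures 1 and 3 (average the embeddings of a day's analysis reports, concatenate the result with that day's manual factors, and train against an MSE loss on predicted returns) can be sketched as follows. This is a minimal illustration in plain Python: the function names are hypothetical, the pricing network itself is not modeled, and the daily smoothing mentioned in the TL;DR is not reproduced.

```python
def average_embedding(report_embeddings):
    """Element-wise mean of the embeddings of one day's analysis reports."""
    n = len(report_embeddings)
    dim = len(report_embeddings[0])
    return [sum(vec[i] for vec in report_embeddings) / n for i in range(dim)]


def build_network_input(report_embeddings, manual_factors):
    """Concatenate the daily average embedding with the manual factors.

    Assumption: plain vector concatenation, matching the circled plus
    symbol in Figure 3; any further inputs to the network are omitted.
    """
    return average_embedding(report_embeddings) + list(manual_factors)


def mse_loss(predicted_returns, realized_returns):
    """Mean squared error between predicted and realized returns."""
    n = len(predicted_returns)
    return sum(
        (p - r) ** 2 for p, r in zip(predicted_returns, realized_returns)
    ) / n


if __name__ == "__main__":
    # Toy vectors for illustration only.
    day_reports = [[1.0, 2.0], [3.0, 4.0]]
    day_factors = [0.5]
    print(build_network_input(day_reports, day_factors))
```

Averaging gives the network a fixed-size input regardless of how many reports the agent writes on a given day, which is one plausible reason for the design.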
