
MLP, XGBoost, KAN, TDNN, and LSTM-GRU Hybrid RNN with Attention for SPX and NDX European Call Option Pricing

Boris Ter-Avanesov, Homayoon Beigi

TL;DR

The paper prices European call options on SPX and NDX with a suite of supervised-learning architectures and benchmarks them against Black-Scholes (BS). It cites the Universal Approximation Theorem and the Kolmogorov-Arnold Representation Theorem (KART) as motivation for the approaches compared: MLP, XGBoost, TDNN, RNN with self-attention, and Kolmogorov-Arnold Networks (KAN). All models are trained on OptionMetrics IvyDB US data (2015–2023) with inputs such as $S/K$, $K$, $T-t$, $r$, and six volatility estimates. The LSTM-GRU hybrid RNN with attention delivers the best overall pricing performance, and XGBoost and KAN also show strong results, while BS is the weakest baseline and, in particular, fails to capture the volatility smile. The findings indicate that attention-enhanced temporal models and KANs calibrate option prices to market data better than BS, which suggests practical value for hedging and risk management and points to further improvements such as model ensembles. The work contributes a comparison of a broad spectrum of architectures on real market data and brings self-attention and KART-based constructs into option pricing, with potential relevance to market practitioners and calibration workflows. Two equations anchor the BS baseline and the stochastic pricing intuition used for comparison: the risk-neutral pricing formula $f(t,S)=e^{-r(T-t)} \, \mathbb{E}^{\mathbb{Q}}[\max(S_T-K,0)\,|\,\mathcal{F}_t]$ and the Black-Scholes PDE $\frac{\partial f}{\partial t}+rS\frac{\partial f}{\partial S}+\frac{1}{2}\sigma^2S^2\frac{\partial^2 f}{\partial S^2}-rf=0$.
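For readers who want a concrete form of the BS baseline, the following is a minimal sketch of the standard closed-form European call price that solves the PDE above. It assumes no dividend yield and a constant volatility `sigma`; the function names `norm_cdf` and `bs_call_price` are illustrative and are not taken from the paper, which may compute its benchmark differently (for example, with one of its six volatility estimates as `sigma`).

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call_price(S: float, K: float, tau: float, r: float, sigma: float) -> float:
    """Black-Scholes price of a European call.

    S: spot price, K: strike, tau: time to maturity T - t in years,
    r: continuously compounded risk-free rate, sigma: annualized volatility.
    Assumption: no dividends (a simplification for index options).
    """
    if tau <= 0.0:
        # At expiry the option is worth its intrinsic value.
        return max(S - K, 0.0)
    sqrt_tau = math.sqrt(tau)
    d1 = (math.log(S / K) + (r + 0.5 * sigma ** 2) * tau) / (sigma * sqrt_tau)
    d2 = d1 - sigma * sqrt_tau
    return S * norm_cdf(d1) - K * math.exp(-r * tau) * norm_cdf(d2)


if __name__ == "__main__":
    # Illustrative at-the-money example: S = K = 100, one year, r = 5%, sigma = 20%.
    print(round(bs_call_price(100.0, 100.0, 1.0, 0.05, 0.2), 4))
```

Because `sigma` is a single constant here, the formula cannot reproduce a strike-dependent implied volatility, which is the volatility-smile limitation the summary mentions.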

Abstract

We explore the performance of various artificial neural network architectures, including a multilayer perceptron (MLP), Kolmogorov-Arnold network (KAN), LSTM-GRU hybrid recurrent neural network (RNN) models, and a time-delay neural network (TDNN) for pricing European call options. In this study, we attempt to leverage the ability of supervised learning methods, such as ANNs, KANs, and gradient-boosted decision trees, to approximate complex multivariate functions in order to calibrate option prices based on past market data. The motivation for using ANNs and KANs is the Universal Approximation Theorem and Kolmogorov-Arnold Representation Theorem, respectively. Specifically, we use S&P 500 (SPX) and NASDAQ 100 (NDX) index options traded during 2015-2023 with times to maturity ranging from 15 days to over 4 years (OptionMetrics IvyDB US dataset). As a benchmark, we use the performance of the Black & Scholes (BS) PDE model (Black and Scholes, 1973) in pricing the same options against real data. This model relies on strong assumptions, and it has been observed and discussed in the literature that real data does not match its predictions. Because of these limitations, supervised learning methods are widely used as an alternative for calibrating option prices. In our experiments, the BS model underperforms all of the other models. The best TDNN model outperforms the best MLP model on all error metrics, and the KAN model outperforms both the TDNN and MLP models. We implement a simple self-attention mechanism to enhance the RNN models, significantly improving their performance. The best-performing model overall is the LSTM-GRU hybrid RNN model with attention. We analyze the performance of all models by ticker, moneyness category, and over/under/correctly-priced percentage.
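The over/under/correctly-priced breakdown mentioned above can be illustrated with a small sketch. The abstract does not state how "correctly priced" is defined, so the relative tolerance `tol` below is a hypothetical choice, as are the function name `pricing_breakdown` and the sample numbers; this is one plausible reading, not the authors' procedure.

```python
from typing import Dict, Sequence


def pricing_breakdown(
    predicted: Sequence[float],
    market: Sequence[float],
    tol: float = 0.01,  # hypothetical: within 1% of the market price counts as correct
) -> Dict[str, float]:
    """Percentage of options a model over-, under-, or correctly prices.

    An option is over-priced when the model price exceeds the market price
    by more than `tol` in relative terms, under-priced when it falls short
    by more than `tol`, and correctly priced otherwise.
    """
    if len(predicted) != len(market) or len(market) == 0:
        raise ValueError("predicted and market must be non-empty and equal in length")
    counts = {"over": 0, "under": 0, "correct": 0}
    for model_price, market_price in zip(predicted, market):
        rel_err = (model_price - market_price) / market_price
        if rel_err > tol:
            counts["over"] += 1
        elif rel_err < -tol:
            counts["under"] += 1
        else:
            counts["correct"] += 1
    n = len(market)
    return {label: 100.0 * count / n for label, count in counts.items()}


if __name__ == "__main__":
    # Made-up prices purely for illustration.
    print(pricing_breakdown([10.5, 9.0, 10.0, 10.05], [10.0, 10.0, 10.0, 10.0]))
```

Grouping the inputs by ticker or by moneyness bucket before calling the function would give the per-category view the abstract describes.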
