Machine learning for option pricing: an empirical investigation of network architectures

Serena Della Corte, Laurens Van Mieghem, Antonis Papapantoleon, Jonas Papazoglou-Hennig

TL;DR

The paper investigates how neural network architecture influences the accuracy and training efficiency of learning option prices and implied volatilities. It systematically compares MLP, residual, highway, generalized highway, and DGM-based architectures across Black-Scholes, Heston, and transformed implied-volatility problems, including capacity-normalized analyses and real-market data. The generalized highway network consistently yields strong performance for pricing tasks, while a simplified DGM variant excels on the transformed implied-volatility mapping; results reveal a clear task-dependent pattern rather than a single universally best architecture. The findings highlight the importance of model design and data formulation in neural pricing models, offering practical guidance for practitioners in selecting architectures based on the target problem and available computational budgets. The study also confirms that results with synthetic data generalize to real implied-volatility data, supporting the relevance of the reported architectural trends for real-world applications.

Abstract

We consider the supervised learning problem of learning the price of an option or the implied volatility given appropriate input data (model parameters) and corresponding output data (option prices or implied volatilities). The majority of articles in this literature consider a (plain) feedforward neural network architecture to connect the neurons used for learning the function mapping inputs to outputs. In this article, motivated by methods in image classification and recent advances in machine learning methods for PDEs, we investigate empirically whether and how the choice of network architecture affects the accuracy and training time of a machine learning algorithm. We find that the generalized highway network architecture achieves the best performance, with the mean squared error and the training time as criteria, within the considered parameter budgets for the Black-Scholes and Heston option pricing problems. For the transformed implied volatility problem, a simplified DGM variant achieves the lowest error among the tested architectures. We also carry out a capacity-normalised comparison for completeness, where all architectures are evaluated with an equal number of parameters. Finally, for the implied volatility problem, we additionally include experiments using real market data.

Paper Structure (35 sections, 24 equations, 29 figures, 23 tables)

Figures (29)

  • Figure 2.1: The market implied volatility for CBOE's SPX options.
  • Figure 3.2: A visual representation of a multilayer perceptron (MLP).
  • Figure 4.3: Schematic representation of a single MLP layer, cf. \ref{eq:MLP_layer_operation}.
  • Figure 4.4: Schematic representation of a single residual layer, cf. \ref{eq:residual_layer_operation}.
  • Figure 4.5: Schematic representation of a single highway layer, cf. \ref{eq:simplified_highway_layer}.
  • ...and 24 more figures