Risk-averse policies for natural gas futures trading using distributional reinforcement learning

Félicien Hêche; Biagio Nigro; Oussama Barakat; Stephan Robert-Nicoud

TL;DR

This work addresses trading in volatile energy markets by applying distributional reinforcement learning to natural gas futures. It compares C51, QR-DQN, and IQN against classical RL and an Extra-Trees baseline, demonstrating that distributional methods outperform traditional RL by over 32% in mean P&L and can yield risk-averse policies through CVaR optimization. By training with CVaR$_\alpha$ objectives, the study shows adjustable risk aversion, with IQN$_{\alpha}$ providing the broadest and most robust range of risk profiles, while QR-DQN's behavior is more variable. The results suggest distributional RL is a promising framework for developing adaptable, risk-sensitive trading strategies in volatile markets and point to future work on alternative risk measures, ensembles, and broader market applications.

Abstract

Financial markets have experienced significant instabilities in recent years, creating unique challenges for trading and increasing interest in risk-averse strategies. Distributional Reinforcement Learning (RL) algorithms, which model the full distribution of returns rather than just expected values, offer a promising approach to managing market uncertainty. This paper investigates this potential by studying the effectiveness of three distributional RL algorithms for natural gas futures trading and exploring their capacity to develop risk-averse policies. Specifically, we analyze the performance and behavior of Categorical Deep Q-Network (C51), Quantile Regression Deep Q-Network (QR-DQN), and Implicit Quantile Network (IQN). To the best of our knowledge, these algorithms have never been applied in a trading context. These policies are compared against five Machine Learning (ML) baselines, using a detailed dataset provided by Predictive Layer SA, a company supplying ML-based strategies for energy trading. The main contributions of this study are as follows. (1) We demonstrate that distributional RL algorithms significantly outperform classical RL methods, with C51 achieving a performance improvement of more than 32%. (2) We show that training C51 and IQN to maximize CVaR produces risk-sensitive policies with adjustable risk aversion. Specifically, our ablation studies reveal that lower CVaR confidence levels increase risk aversion, while higher levels decrease it, offering flexible risk management options. In contrast, QR-DQN shows less predictable behavior. These findings emphasize the potential of distributional RL for developing adaptable, risk-averse trading strategies in volatile markets.
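
To illustrate how a CVaR objective can change the chosen action, the sketch below selects actions from per-action return quantiles, the kind of output a quantile-based agent such as QR-DQN produces. It is a minimal illustration and not the paper's implementation: the function names, the equal weighting of quantiles, the reading of $\alpha$ as the fraction of worst outcomes averaged, and the numbers are all assumptions made for this example.

```python
import numpy as np

def cvar_from_quantiles(quantiles, alpha):
    """Approximate CVaR_alpha as the mean of the lowest alpha-fraction
    of equally weighted quantile estimates (assumed convention)."""
    q = np.sort(np.asarray(quantiles, dtype=float), axis=-1)
    # Number of tail quantiles to average; always keep at least one.
    k = max(1, int(np.ceil(alpha * q.shape[-1])))
    return q[..., :k].mean(axis=-1)

def select_action(quantiles_per_action, alpha):
    """Pick the action whose estimated return distribution has the highest CVaR_alpha."""
    return int(np.argmax(cvar_from_quantiles(quantiles_per_action, alpha)))

# Hypothetical return quantiles for two actions (illustrative numbers only).
quantiles = np.array([
    [-10.0, -2.0, 3.0, 8.0, 16.0],  # action 0: higher mean, heavy left tail
    [-1.0, 0.0, 1.0, 2.0, 3.0],     # action 1: lower mean, mild left tail
])

risk_neutral = select_action(quantiles, alpha=1.0)  # averages every quantile
risk_averse = select_action(quantiles, alpha=0.2)   # averages only the worst 20%
```

Under these assumptions, $\alpha = 1$ reduces to the ordinary expected return and prefers the higher-mean action, while a small $\alpha$ looks only at the worst outcomes and prefers the action with the milder left tail. This matches the abstract's observation that lower confidence levels increase risk aversion.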

Paper Structure (32 sections, 23 equations, 5 figures, 8 tables, 1 algorithm)

Introduction
Related work
Machine learning in finance
Machine learning for trading
Reinforcement learning in finance
Reinforcement learning for trading
Preliminaries
Conditional Value-at-Risk
Reinforcement learning
Trading agents
Extra-Trees
Classical RL
DQN
Prioritized DQN
Dueling DQN
...and 17 more sections

Figures (5)

Figure 1: Illustration of the neural architecture used in Dueling DQN.
Figure 2: Illustration of the neural architecture used in IQN.
Figure 3: Overview of the proposed approach illustrating states constructions and their subsequent use.
Figure 4: Illustration of training and testing periods employed in the four conducted experiments.
Figure 5: Distributions of $\Delta_{t}$ with corresponding kernel density estimates across the four testing conditions.
