Diffusion-Augmented Reinforcement Learning for Robust Portfolio Optimization under Stress Scenarios

Himanshu Choudhary; Arishi Orra; Manoj Thakur

TL;DR

Robust portfolio optimization under regime shifts is framed as a Markov decision process (MDP) with states $\mathcal{S}$, actions $\mathcal{A}$, transition kernel $\mathbb{P}$, reward $\mathcal{R}$, and discount factor $\gamma$; training such an agent is hampered by the scarcity of crisis data. The proposed approach, Diffusion-Augmented Reinforcement Learning (DARL), combines a conditional diffusion-based data generator with a PPO agent, augmenting training with crisis-like scenarios conditioned on crash intensity $c$ to improve policy robustness and learning stability. Empirical results on the Dow Jones 30 show superior cumulative and risk-adjusted performance (higher cumulative return and Sharpe ratio, lower maximum drawdown) and resilience during unseen events such as the 2025 Tariff Crisis. The result is a practical, scalable method for strengthening tail-risk resilience in DRL-driven portfolio management.
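For readers unfamiliar with the three evaluation metrics named above, the following is a minimal sketch of their textbook definitions. It assumes per-period simple returns, a sample standard deviation, and 252 trading periods per year for annualization; the function names are illustrative, and the paper's exact conventions (risk-free rate, annualization factor) are not stated in this summary.

```python
import math


def cumulative_return(returns):
    """Compound per-period simple returns into a total return."""
    wealth = 1.0
    for r in returns:
        wealth *= 1.0 + r
    return wealth - 1.0


def sharpe_ratio(returns, risk_free=0.0, periods_per_year=252):
    """Annualized Sharpe ratio using the sample standard deviation.

    risk_free is a per-period rate; 252 periods per year is an
    assumption suited to daily equity data.
    """
    excess = [r - risk_free for r in returns]
    n = len(excess)
    mean = sum(excess) / n
    variance = sum((x - mean) ** 2 for x in excess) / (n - 1)
    std = math.sqrt(variance)
    if std == 0.0:
        return float("nan")  # undefined when returns do not vary
    return mean / std * math.sqrt(periods_per_year)


def max_drawdown(returns):
    """Largest peak-to-trough decline of the wealth curve (positive fraction)."""
    wealth, peak, worst = 1.0, 1.0, 0.0
    for r in returns:
        wealth *= 1.0 + r
        peak = max(peak, wealth)
        worst = max(worst, (peak - wealth) / peak)
    return worst
```

A strategy that is more robust to crashes would be expected to show a smaller `max_drawdown` on a stress window, while `sharpe_ratio` summarizes return per unit of volatility over the whole test period.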

Abstract

In the ever-changing and intricate landscape of financial markets, portfolio optimisation remains a formidable challenge for investors and asset managers. Conventional methods often struggle to capture the complex dynamics of market behaviour and align with diverse investor preferences. To address this, we propose an innovative framework, termed Diffusion-Augmented Reinforcement Learning (DARL), which synergistically integrates Denoising Diffusion Probabilistic Models (DDPMs) with Deep Reinforcement Learning (DRL) for portfolio management. By leveraging DDPMs to generate synthetic market crash scenarios conditioned on varying stress intensities, our approach significantly enhances the robustness of training data. Empirical evaluations demonstrate that DARL outperforms traditional baselines, delivering superior risk-adjusted returns and resilience against unforeseen crises, such as the 2025 Tariff Crisis. This work offers a robust and practical methodology to bolster stress resilience in DRL-driven financial applications.
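To make the DDPM ingredient concrete, here is a minimal sketch of the standard forward (noising) process, $x_t = \sqrt{\bar\alpha_t}\,x_0 + \sqrt{1-\bar\alpha_t}\,\epsilon$, used to build training examples for a conditional denoiser. Everything specific here is an assumption for illustration: the linear schedule endpoints are the common defaults from the DDPM literature rather than the paper's settings, the stress intensity is treated as a plain scalar label `c`, and the denoising network itself is omitted because the abstract does not describe it.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (assumed defaults, not the paper's)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over s <= t of (1 - beta_s)."""
    alpha_bars, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return alpha_bars


def q_sample(x0, t, alpha_bars, rng):
    """Draw x_t from q(x_t | x_0) in closed form; also return the noise used."""
    a = alpha_bars[t]
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * x + math.sqrt(1.0 - a) * e for x, e in zip(x0, eps)]
    return xt, eps


def make_training_example(x0, crash_intensity, alpha_bars, rng):
    """Package one example for a hypothetical conditional denoiser.

    The denoiser would be trained to predict `eps` from (x_t, t, c).
    """
    t = rng.randrange(len(alpha_bars))
    xt, eps = q_sample(x0, t, alpha_bars, rng)
    return {"x_t": xt, "t": t, "c": crash_intensity, "eps": eps}


if __name__ == "__main__":
    rng = random.Random(0)
    alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
    window = [0.004, -0.012, -0.031, 0.007]  # made-up return window
    example = make_training_example(window, 0.5, alpha_bars, rng)
    print(example["t"], example["c"], len(example["x_t"]))
```

At sampling time, a trained conditional model would be run in reverse from pure noise with a chosen `c`, and the generated windows would be mixed into the RL agent's training data; that reverse step is not shown here.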
