Cryptocurrency Portfolio Management with Reinforcement Learning: Soft Actor--Critic and Deep Deterministic Policy Gradient Algorithms

Kamal Paykan

TL;DR

This work addresses adaptive cryptocurrency portfolio management under highly volatile, nonstationary market conditions using deep reinforcement learning (DRL) agents. It compares two continuous-action DRL algorithms, Soft Actor--Critic (SAC) and Deep Deterministic Policy Gradient (DDPG), within a unified trading environment, using an LSTM-based feature extractor to model temporal dependencies. Empirical results show that SAC delivers better risk-adjusted performance and greater stability than both DDPG and a Markowitz mean--variance baseline, which points to the benefit of entropy regularization in volatile markets. The findings suggest that data-driven reinforcement learning strategies can provide robust, adaptive portfolio management for digital assets. Future work aims to incorporate multimodal data and more advanced architectures to improve scalability and interpretability.

Abstract

This paper proposes a reinforcement learning--based framework for cryptocurrency portfolio management using the Soft Actor--Critic (SAC) and Deep Deterministic Policy Gradient (DDPG) algorithms. Traditional portfolio optimization methods often struggle to adapt to the highly volatile and nonlinear dynamics of cryptocurrency markets. To address this, we design an agent that learns continuous trading actions directly from historical market data through interaction with a simulated trading environment. The agent optimizes portfolio weights to maximize cumulative returns while minimizing downside risk and transaction costs. Experimental evaluations on multiple cryptocurrencies demonstrate that the SAC and DDPG agents outperform baseline strategies such as equal-weighted and mean--variance portfolios. The SAC algorithm, with its entropy-regularized objective, is more stable and robust than DDPG in noisy market conditions. These results highlight the potential of deep reinforcement learning for adaptive and data-driven portfolio management in cryptocurrency markets.
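
The abstract does not specify how continuous actions become portfolio weights or how transaction costs enter the reward. The following is only a minimal sketch of one common formulation, not the paper's environment. It assumes that the raw action is mapped to long-only weights with a softmax, that costs are proportional to turnover, and that the reward is the one-period log return. The names `portfolio_step` and `cost_rate`, and the value 0.001, are hypothetical.

```python
import numpy as np


def softmax(x):
    """Map an unconstrained action vector to weights that sum to 1."""
    z = x - np.max(x)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()


def portfolio_step(prev_weights, raw_action, price_relatives, cost_rate=0.001):
    """Simulate one rebalancing step (illustrative assumption, not the paper's model).

    prev_weights    : weights held before rebalancing (sum to 1)
    raw_action      : unconstrained actor output, one entry per asset
    price_relatives : next-period price ratios p_{t+1} / p_t per asset
    cost_rate       : hypothetical proportional cost per unit of turnover
    Returns (new_weights, log_return).
    """
    prev_weights = np.asarray(prev_weights, dtype=float)
    price_relatives = np.asarray(price_relatives, dtype=float)
    new_weights = softmax(np.asarray(raw_action, dtype=float))

    # Turnover is the total absolute change in weights caused by rebalancing.
    turnover = np.abs(new_weights - prev_weights).sum()
    cost = cost_rate * turnover

    # Portfolio growth over the next period, net of the rebalancing cost.
    growth = (1.0 - cost) * float(np.dot(new_weights, price_relatives))
    return new_weights, float(np.log(growth))


if __name__ == "__main__":
    w, r = portfolio_step([1.0, 0.0], [0.0, 0.0], [1.05, 0.98])
    print("weights:", w, "log return:", r)
```

In a setup like this, both SAC and DDPG would emit `raw_action` from their actor networks. The two algorithms differ in how the actor is trained, not in how the environment turns actions into weights.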
