Adaptive Dueling Double Deep Q-networks in Uniswap V3 Replication and Extension with Mamba

Zhaofeng Zhang

TL;DR

This work replicates an adaptive liquidity provision approach for Uniswap V3 based on Dueling DDQN, covering data acquisition from the Uniswap Subgraph, implementation specifics, and comparison against baselines. It then extends the framework with Mamba-DDQN (M-DDQN), which replaces the MLP with a Mamba state space model (SSM) to better capture temporal structure and nonstationarity, accompanied by data pruning and reward shaping. The extension introduces two new baselines (Buy-and-Hold and Daily Rebalancing) and reports that M-DDQN has stronger theoretical support than the original model and performs better in some tests, though it has not yet been applied to all datasets and results depend on dataset, seeds, and protocol details. Overall, the study proposes a temporally aware RL formulation for automated liquidity provision, with practical implications for gas costs, trading fees, and risk-adjusted PnL in DeFi markets.

Abstract

The report goes through the main steps of replicating and improving the article "Adaptive Liquidity Provision in Uniswap V3 with Deep Reinforcement Learning." The replication part includes how to obtain data from the Uniswap Subgraph, details of the implementation, and comments on the results. After the replication, I propose a new structure based on the original model, which combines Mamba with DDQN and a new reward function. In this new structure, I clean the data again and introduce two new baselines for comparison. As a result, although the model has not yet been applied to all datasets, it shows stronger theoretical support than the original model and performs better in some tests.

Paper Structure

This paper contains 31 sections, 19 equations, 5 figures, 9 tables.

Figures (5)

  • Figure 1: Overview of the work
  • Figure 2: Comparison between the s of contract price
  • Figure 3: Caption
  • Figure 4: Correlation and distribution of features
  • Figure 5: Framework of Mamba DDQN