Generic Selfish Mining MDP for DAG Protocols
Patrik Keller
TL;DR
Evaluating Selfish Mining across DAG-based PoW protocols is difficult because each protocol has so far required its own attack model. The paper introduces a generic, modular attack space that models any supported protocol as five pure functions operating on a BlockDAG. The framework derives Selfish Mining MDPs automatically and uses state-space compression and Probabilistic Termination to make policy search scalable. Validated against Bitcoin, the generic model reproduces established results, which supports its correctness and its compatibility with prior approaches. The modular design enables fair cross-protocol comparisons and extends to Ethereum-like and other DAG protocols. The work offers a practical foundation for uniform incentive analysis and protocol design in DAG-based systems. Avenues for future work include short-term metrics, censorship considerations, and RL-based policy optimization.
Abstract
Selfish Mining is strategic rule-breaking to maximize rewards in proof-of-work protocols [3]. Markov Decision Processes (MDPs) are the preferred tool for finding optimal strategies in Bitcoin [4, 10] and similar linear chain protocols [12]. Protocols increasingly adopt non-sequential chain structures [11], for which MDP analysis is more involved [2]. To date, researchers have tailored a specific attack space to each protocol [2, 4, 5, 7, 10, 12]. Because the underlying assumptions differ, validating and comparing results is difficult. To overcome this, we propose a generic attack space that supports a wide range of DAG protocols, including Ethereum, Fruitchains, and Parallel Proof-of-Work. Our approach is modular: we specify each protocol as one program and then derive the Selfish Mining MDPs automatically.
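
To make the idea of "one program per protocol" concrete, the following is a minimal sketch of how a Bitcoin-like longest-chain rule might be written as pure functions over a BlockDAG. Every name here (`Block`, `BlockDAG`, `mining_parents`, `is_valid`, `preferred_tip`, `history`, `rewards`) is a hypothetical choice made for illustration and does not necessarily match the functions or interfaces of the actual framework. The automatic MDP derivation itself is not shown.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Block:
    id: int
    parents: Tuple[int, ...]
    miner: str  # hypothetical labels: "genesis", "attacker", "defender"


@dataclass
class BlockDAG:
    blocks: Dict[int, Block] = field(default_factory=dict)

    def add(self, parents: Tuple[int, ...], miner: str) -> int:
        """Append a block and return its id."""
        block_id = len(self.blocks)
        self.blocks[block_id] = Block(block_id, tuple(parents), miner)
        return block_id

    def height(self, block_id: int) -> int:
        """Length of the longest path from this block back to genesis."""
        parents = self.blocks[block_id].parents
        return 0 if not parents else 1 + max(self.height(p) for p in parents)


# Hypothetical protocol specification: Bitcoin-like longest chain.
# Each function only reads the DAG and returns a value.

def mining_parents(dag: BlockDAG, tip: int) -> Tuple[int, ...]:
    """A new block extends the miner's preferred tip."""
    return (tip,)


def is_valid(dag: BlockDAG, block_id: int) -> bool:
    """Blocks have at most one parent, and that parent must exist."""
    parents = dag.blocks[block_id].parents
    return len(parents) <= 1 and all(p in dag.blocks for p in parents)


def preferred_tip(dag: BlockDAG, old_tip: int, candidate: int) -> int:
    """Switch to the candidate only if it is strictly higher."""
    return candidate if dag.height(candidate) > dag.height(old_tip) else old_tip


def history(dag: BlockDAG, tip: int) -> List[int]:
    """Linear history from genesis to the given tip."""
    chain = [tip]
    while dag.blocks[chain[-1]].parents:
        chain.append(dag.blocks[chain[-1]].parents[0])
    return list(reversed(chain))


def rewards(dag: BlockDAG, tip: int) -> Dict[str, int]:
    """One unit of reward per non-genesis block in the history."""
    totals: Dict[str, int] = {}
    for block_id in history(dag, tip):
        block = dag.blocks[block_id]
        if block.parents:  # skip genesis
            totals[block.miner] = totals.get(block.miner, 0) + 1
    return totals


# Usage: a two-block main chain plus a one-block fork from genesis.
dag = BlockDAG()
genesis = dag.add((), "genesis")
a = dag.add(mining_parents(dag, genesis), "defender")
b = dag.add(mining_parents(dag, a), "attacker")
fork = dag.add(mining_parents(dag, genesis), "attacker")

tip = genesis
for candidate in (a, b, fork):
    tip = preferred_tip(dag, tip, candidate)
```

Because the functions in this sketch do not mutate the DAG, a generic tool could in principle call them from any attacker or defender view of the same DAG, which is the property that makes a protocol-independent attack space plausible.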
