BFTBrain: Adaptive BFT Consensus with Reinforcement Learning
Chenyuan Wu, Haoyun Qin, Mohammad Javad Amiri, Boon Thau Loo, Dahlia Malkhi, Ryan Marcus
TL;DR
BFTBrain addresses the problem of choosing the best Byzantine fault-tolerant protocol under changing workloads and fault conditions by means of a decentralized reinforcement-learning framework. It models protocol selection as a contextual multi-armed bandit (CMAB), coordinates learning through consensus so that nodes fuse their local observations into a single global decision, and switches protocols at epoch boundaries to maximize a user-defined performance metric. Under dynamic conditions, the approach improves throughput by 18% to 119% over fixed protocols and by 44% to 154% over prior learning-based methods, and it remains robust against data pollution and hardware shifts. Its decentralized design, modest overhead, and plug-and-play compatibility across diverse hardware configurations make it practical for real-world state machine replication (SMR) deployments.
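To make the bandit framing concrete, the following is a minimal sketch of per-epoch protocol selection as a contextual multi-armed bandit. It is not BFTBrain's learning engine: the epsilon-greedy rule, the discrete context labels, and the names `EpochBandit`, `run_epochs`, and `measure` are all assumptions introduced here for illustration.

```python
import random
from collections import defaultdict


class EpochBandit:
    """Epsilon-greedy contextual bandit over a fixed set of protocols (arms).

    Illustrative only: the exploration rule and the context encoding are
    assumptions, not the learner described in the paper.
    """

    def __init__(self, protocols, epsilon=0.1, seed=0):
        self.protocols = list(protocols)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        # (context, protocol) -> [epochs observed, running mean reward]
        self.stats = defaultdict(lambda: [0, 0.0])

    def choose(self, context):
        """Pick the protocol to run in the next epoch for this context."""
        untried = [p for p in self.protocols if self.stats[(context, p)][0] == 0]
        if untried:
            return untried[0]  # try every arm once per context first
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.protocols)  # explore
        # exploit: highest mean reward seen so far in this context
        return max(self.protocols, key=lambda p: self.stats[(context, p)][1])

    def update(self, context, protocol, reward):
        """Fold the reward measured during the finished epoch into the mean."""
        entry = self.stats[(context, protocol)]
        entry[0] += 1
        entry[1] += (reward - entry[1]) / entry[0]


def run_epochs(bandit, measure, contexts):
    """Make one protocol decision per epoch; `measure` returns the reward."""
    history = []
    for context in contexts:
        protocol = bandit.choose(context)
        reward = measure(context, protocol)  # e.g. throughput in that epoch
        bandit.update(context, protocol, reward)
        history.append(protocol)
    return history
```

Here the reward stands in for the user-defined performance metric, and a decision is made only between epochs, mirroring the switch at epoch boundaries. A production learner would replace the discrete context with numeric workload and fault features.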
Abstract
This paper presents BFTBrain, a reinforcement learning (RL) based Byzantine fault-tolerant (BFT) system that provides significant operational benefits: it is a plug-and-play system suitable for a broad set of hardware and network configurations, and it adjusts effectively in real time to changing fault scenarios and workloads. BFTBrain adapts to system conditions and application needs by switching between a set of BFT protocols in real time. Two main advances contribute to BFTBrain's agility and performance. First, BFTBrain is based on a systematic, thorough modeling of metrics that correlate the performance of the studied BFT protocols with varying fault scenarios and workloads. These metrics are fed as features to BFTBrain's RL engine so that it can choose the best-performing BFT protocol in real time. Second, BFTBrain coordinates RL in a decentralized manner that is resilient to adversarial data pollution: nodes share local metering values and reach the same learning output by consensus. As a result, in addition to providing significant operational benefits, BFTBrain improves throughput over fixed protocols by $18\%$ to $119\%$ under dynamic conditions and outperforms state-of-the-art learning-based approaches by $44\%$ to $154\%$.
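The pollution-resilient coordination step can be illustrated with a small sketch. The abstract does not state how shared metering values are combined, so the median rule, the `2f + 1` report threshold, and the names `robust_aggregate` and `reports` are assumptions made here for illustration. The idea shown is only that a deterministic, outlier-resistant function applied to an agreed set of reports gives every honest node the same value.

```python
from statistics import median


def robust_aggregate(reports, f):
    """Combine per-node metering values into one value that resists outliers.

    `reports` maps a node id to the value that node reported; `f` is the
    assumed bound on faulty nodes. The median is an illustrative choice,
    not necessarily the rule used by BFTBrain.
    """
    if len(reports) < 2 * f + 1:
        raise ValueError("need at least 2f + 1 reports to tolerate f faulty nodes")
    # With at most f faulty values among at least 2f + 1 reports, faulty
    # values are a strict minority, so the median stays within the range
    # spanned by the honest reports.
    return median(reports.values())


# Hypothetical reports from four nodes; node "n3" tries to pollute the data.
reports = {"n0": 100.0, "n1": 102.0, "n2": 98.0, "n3": 1e9}
agreed_value = robust_aggregate(reports, f=1)
```

Because the function is deterministic, nodes that first agree on the same set of reports (for example, by ordering them through consensus) compute the same input for learning, while the single extreme report cannot pull the result outside the honest range.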
