A Trust-Aware and Cost-Optimized Blockchain Oracle Selection Model with Deep Reinforcement Learning

Hengyang Zhang; Shike Li; Hang Bao; Sixing Wu; Jianbin Li

TL;DR

This work tackles the dual challenges of trust and cost in blockchain oracles for IoT-enabled DApps by introducing TCO-DRL, a framework that combines a multi-dimensional trust management mechanism with a deep reinforcement learning–driven oracle selection strategy. The trust module aggregates reliability, behavior, and token-based signals with a time-aware factor to produce dynamic reputation scores, while a DRL (DQN) controller learns to map data requests to high-reputation, low-cost oracles under budget constraints. Empirical results on Ethereum demonstrate that TCO-DRL significantly reduces allocations to malicious oracles (by over 39.1%) and saves more than 12% in costs, with robust performance under ME, OOA, and OSA attacks. The approach advances practical, attack-resilient, cost-aware data provisioning for DApps, and the authors provide open-source code for reproducibility.

Abstract

The rapid development of blockchain technology has driven the widespread application of decentralized applications (DApps) across various fields. However, DApps cannot directly access external data and rely on oracles to interact with off-chain data. As a bridge between the blockchain and external data sources, oracles may behave maliciously and inject incorrect or harmful data, leading to trust and security issues. Additionally, with the surge in data requests, the disparity in oracle trustworthiness and costs has increased, making the dynamic selection of the most suitable oracle for each request a critical challenge. To address these issues, this paper proposes a Trust-Aware and Cost-Optimized Blockchain Oracle Selection Model with Deep Reinforcement Learning (TCO-DRL). The model incorporates a comprehensive trust management mechanism to evaluate oracle reputation from multiple dimensions and employs an improved sliding time window to monitor reputation changes in real time, enhancing resistance to malicious attacks. Moreover, TCO-DRL uses deep reinforcement learning algorithms to dynamically adapt to fluctuations in oracle reputation, ensuring the selection of high-reputation oracles while optimizing node selection, thereby reducing costs without compromising data quality. We implemented and validated TCO-DRL on Ethereum. Experimental results show that, compared to existing methods, TCO-DRL reduces the allocation rate to malicious oracles by more than 39.10% and saves over 12.00% in costs. Furthermore, simulated experiments on various malicious attacks validate the robustness and effectiveness of TCO-DRL.
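To make the two components described above more concrete, the following is a minimal illustrative sketch and not the authors' implementation. It assumes a reputation score formed as a time-decayed weighted average of reliability, behavior, and token signals, and it substitutes a simple greedy rule for the learned DQN policy. All names (`Observation`, `reputation`, `select_oracle`), the weights, the exponential decay, and the `trade_off` parameter are hypothetical choices made for illustration only.

```python
import math
from dataclasses import dataclass


@dataclass
class Observation:
    """One past interaction with an oracle (all fields are assumed, in [0, 1])."""
    reliability: float
    behavior: float
    token: float
    age: float  # time units elapsed since the interaction


def reputation(observations, weights=(0.5, 0.3, 0.2), decay=0.1):
    """Time-decayed weighted average of the three trust signals.

    Recent observations count more: each is weighted by exp(-decay * age).
    The weights and the decay form are assumptions, not taken from the paper.
    """
    num, den = 0.0, 0.0
    for ob in observations:
        time_weight = math.exp(-decay * ob.age)
        score = (weights[0] * ob.reliability
                 + weights[1] * ob.behavior
                 + weights[2] * ob.token)
        num += time_weight * score
        den += time_weight
    return num / den if den > 0 else 0.0


def select_oracle(oracles, budget, trade_off=0.5):
    """Greedy stand-in for the learned policy (NOT a DQN).

    `oracles` maps a name to (reputation, cost). Among oracles whose cost
    fits the budget, pick the one maximizing reputation minus a cost penalty.
    """
    affordable = {n: rc for n, rc in oracles.items() if rc[1] <= budget}
    if not affordable:
        return None
    return max(
        affordable,
        key=lambda n: affordable[n][0] - trade_off * affordable[n][1] / budget,
    )
```

In this sketch, an oracle that recently misbehaved receives a low score even if its older record is clean, which mirrors the intent of monitoring reputation changes over a time window. A learned policy would replace the fixed `trade_off` rule with action values estimated from experience.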

Paper Structure

Table of Contents

Figures (18)