AiRacleX: Automated Detection of Price Oracle Manipulations via LLM-Driven Knowledge Mining and Prompt Generation
Bo Gao, Yuan Wang, Qingsong Wei, Yong Liu, Rick Siow Mong Goh, David Lo
TL;DR
AiRacleX tackles price oracle manipulation (POM) in DeFi with a fully LLM-driven detection framework that grounds its prompts in domain knowledge extracted from top-tier academic papers. It uses a three-model pipeline (Domain Knowledge Synthesizer, Prompt Generator, and Auditor) to generate context-aware prompts and autonomously audit smart contracts for POM vulnerabilities. Across real-world attacked projects and Code4rena datasets, AiRacleX achieves substantial recall gains (notably a 2.58x improvement over GPTScan, the state-of-the-art baseline) while maintaining competitive precision, and it can match or exceed human-curated approaches using fully automated knowledge synthesis. The framework also demonstrates the feasibility of replacing commercial models with open-source LLMs, which enhances privacy and security for developers, and it points to extensions toward broader vulnerability classes and automated knowledge evolution.
Abstract
Decentralized finance (DeFi) applications depend on accurate price oracles to ensure secure transactions, yet these oracles are highly vulnerable to manipulation, enabling attackers to exploit smart contract vulnerabilities for unfair asset valuation and financial gain. Detecting such manipulations traditionally relies on the manual effort of experienced experts, which presents significant challenges. In this paper, we propose a novel LLM-driven framework that automates the detection of price oracle manipulations by leveraging the complementary strengths of different large language models (LLMs). Our approach begins with domain-specific knowledge extraction, where an LLM synthesizes precise insights about price oracle vulnerabilities from top-tier academic papers, eliminating the need for profound expertise from developers or auditors. This knowledge forms the foundation for a second LLM to generate structured, context-aware chain-of-thought prompts, which guide a third LLM in accurately identifying manipulation patterns in smart contracts. We validate the effectiveness of the framework through experiments on 60 known vulnerabilities from 46 real-world DeFi attacks or projects spanning 2021 to 2023. The best-performing combination of LLMs identified by AiRacleX (Haiku-Haiku-4o-mini) demonstrates a 2.58-times improvement in recall (0.667 vs. 0.259) compared to the state-of-the-art tool GPTScan, while maintaining comparable precision. Furthermore, our framework demonstrates the feasibility of replacing commercial models with open-source alternatives, enhancing privacy and security for developers.
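The three-stage flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `LLM` callable type, the function names, and the prompt wording are all hypothetical, and each model is represented as a plain function from a prompt string to a completion string so that any commercial or open-source backend could be plugged in.

```python
from typing import Callable, List

# Hypothetical interface: a model is any function from prompt text to reply text.
LLM = Callable[[str], str]


def synthesize_knowledge(synthesizer: LLM, paper_excerpts: List[str]) -> str:
    """Stage 1: distill price oracle manipulation knowledge from literature."""
    joined = "\n\n".join(paper_excerpts)
    return synthesizer(
        "Summarize the price oracle manipulation patterns described in "
        "the following excerpts:\n" + joined
    )


def generate_prompt(generator: LLM, knowledge: str) -> str:
    """Stage 2: turn the synthesized knowledge into a chain-of-thought prompt."""
    return generator(
        "Using this domain knowledge, write a step-by-step auditing prompt "
        "for detecting price oracle manipulation:\n" + knowledge
    )


def audit_contract(auditor: LLM, audit_prompt: str, contract_source: str) -> str:
    """Stage 3: apply the generated prompt to a smart contract's source code."""
    return auditor(audit_prompt + "\n\nContract under audit:\n" + contract_source)


def run_pipeline(
    synthesizer: LLM,
    generator: LLM,
    auditor: LLM,
    paper_excerpts: List[str],
    contract_source: str,
) -> str:
    """Chain the three stages; each stage's output feeds the next."""
    knowledge = synthesize_knowledge(synthesizer, paper_excerpts)
    audit_prompt = generate_prompt(generator, knowledge)
    return audit_contract(auditor, audit_prompt, contract_source)


if __name__ == "__main__":
    # Stub model for demonstration only; a real run would call an LLM API here.
    def stub(prompt: str) -> str:
        return f"(stub reply to a {len(prompt)}-character prompt)"

    print(run_pipeline(stub, stub, stub, ["example excerpt"], "contract C {}"))
```

Keeping each stage behind the same callable signature mirrors the abstract's point that models are interchangeable: swapping a commercial model for an open-source one only changes which function is passed in.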
