Smart Contracts in the Real World: A Statistical Exploration of External Data Dependencies
Yishun Wang, Xiaoqi Li, Shipeng Ye, Lei Xie, Ju Xing
TL;DR
The paper addresses the security and reliability of smart contracts that depend on external data. It combines AST parsing, keyword mining, and manual classification by domain and audit category, applied to a sample of 9,356 contracts and 249 audit reports, to quantify how external data is used and which interaction strategies are common. Key findings: 286 contracts (about 2.86%) interact with external data, and external data dependency correlates positively with code complexity, measured as cyclomatic complexity $V(G) = E - N + 2$, where $E$ is the number of edges and $N$ the number of nodes in the control-flow graph. The authors open-source the datasets and code, and offer practical guidance for developers and auditors across DeFi, gaming, and supply chain contexts.
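As a worked illustration of the complexity measure $V(G) = E - N + 2$, the following minimal Python sketch computes it for a control-flow graph given as a list of directed edges. The function name and the toy graphs are assumptions made for illustration and are not taken from the authors' released code; the formula as written assumes a single connected graph.

```python
def cyclomatic_complexity(edges):
    """Return V(G) = E - N + 2 for a connected control-flow graph.

    `edges` is a list of (source, target) pairs; nodes are inferred
    from the edge endpoints.
    """
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2


# Straight-line code: one edge, two nodes.
straight_line_cfg = [("entry", "exit")]

# A single if/else: the condition node branches to two blocks that rejoin.
if_else_cfg = [
    ("entry", "cond"),
    ("cond", "then"),
    ("cond", "else"),
    ("then", "exit"),
    ("else", "exit"),
]

print(cyclomatic_complexity(straight_line_cfg))  # 1 - 2 + 2 = 1
print(cyclomatic_complexity(if_else_cfg))        # 5 - 5 + 2 = 2
```

Each additional branch point adds one independent path, so a contract that guards an external data read with extra checks would be expected to score higher under this measure.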
Abstract
Interaction with external data is crucial to smart contract functionality but raises security and reliability concerns. Statistical and quantitative studies of this interaction are scarce. To address this gap, we analyzed 10,500 smart contracts, retaining 9,356 valid ones after excluding outdated or erroneous contracts. We parsed the contract code into abstract syntax trees and identified keywords associated with external data dependencies, then conducted a quantitative analysis by comparing these keywords against a reference list. We manually classified the 9,356 valid smart contracts to determine their application domains and typical methods of interacting with external data, and we built a database from this data to facilitate research on smart contract dependencies. In addition, we reviewed over 3,600 security audit reports, manually identified 249 (approximately 9%) related to external data interactions, and categorized their dependencies. We also explored the correlation between smart contract complexity and external data dependency to inform design and auditing processes. These studies aim to enhance the security and reliability of smart contracts and to offer practical guidance to developers and auditors.
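The keyword-matching step described in the abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the AST is a hand-written nested dictionary rather than the output of a real Solidity parser, the reference keyword list is hypothetical and not the authors' list, and the helper names are invented for this example.

```python
from collections import Counter

# Hypothetical reference list for illustration; NOT the authors' keyword list.
REFERENCE_KEYWORDS = {"oracle", "latestrounddata", "pricefeed"}


def iter_names(node):
    """Yield every string stored under a 'name' key anywhere in a nested AST."""
    if isinstance(node, dict):
        name = node.get("name")
        if isinstance(name, str):
            yield name
        for value in node.values():
            yield from iter_names(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_names(item)


def count_keyword_hits(ast, keywords=REFERENCE_KEYWORDS):
    """Count identifiers in the AST that contain a reference keyword."""
    hits = Counter()
    for name in iter_names(ast):
        lowered = name.lower()
        for keyword in keywords:
            if keyword in lowered:
                hits[keyword] += 1
    return hits


# Toy AST standing in for a parsed contract (assumed shape, not a real parser's).
toy_ast = {
    "type": "ContractDefinition",
    "name": "Lending",
    "subNodes": [
        {"type": "StateVariableDeclaration", "name": "priceFeed"},
        {
            "type": "FunctionDefinition",
            "name": "getPrice",
            "body": [{"type": "FunctionCall", "name": "latestRoundData"}],
        },
    ],
}

hits = count_keyword_hits(toy_ast)
depends_on_external_data = bool(hits)
```

Substring matching on lowercased identifiers is a deliberately coarse choice here; it favors recall, which suits a pipeline where flagged contracts are later checked manually.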
