Table of Contents
Fetching ...

SourceP: Detecting Ponzi Schemes on Ethereum with Source Code

Pengcheng Lu, Liang Cai, Keting Yin

TL;DR

This paper proposes SourceP, a method to detect smart Ponzi schemes on the Ethereum platform using pre-trained models and data flow, which only requires using the source code of smart contracts as features to identify Ponzi schemes in smart contracts.

Abstract

As blockchain technology becomes more and more popular, a typical financial scam, the Ponzi scheme, has also emerged in the blockchain platform Ethereum. This Ponzi scheme deployed through smart contracts, also known as the smart Ponzi scheme, has caused a lot of economic losses and negative impacts. Existing methods for detecting smart Ponzi schemes on Ethereum mainly rely on bytecode features, opcode features, account features, and transaction behavior features of smart contracts, which are unable to truly characterize the behavioral features of Ponzi schemes, and thus generally perform poorly in terms of detection accuracy and false alarm rates. In this paper, we propose SourceP, a method to detect smart Ponzi schemes on the Ethereum platform using pre-trained models and data flow, which only requires using the source code of smart contracts as features. SourceP reduces the difficulty of data acquisition and feature extraction of existing detection methods. Specifically, we first convert the source code of a smart contract into a data flow graph and then introduce a pre-trained model based on learning code representations to build a classification model to identify Ponzi schemes in smart contracts. The experimental results show that SourceP achieves 87.2% recall and 90.7% F-score for detecting smart Ponzi schemes within Ethereum's smart contract dataset, outperforming state-of-the-art methods in terms of performance and sustainability. We also demonstrate through additional experiments that pre-trained models and data flow play an important contribution to SourceP, as well as proving that SourceP has a good generalization ability.

SourceP: Detecting Ponzi Schemes on Ethereum with Source Code

TL;DR

This paper proposes SourceP, a method to detect smart Ponzi schemes on the Ethereum platform using pre-trained models and data flow, which only requires using the source code of smart contracts as features to identify Ponzi schemes in smart contracts.

Abstract

As blockchain technology becomes more and more popular, a typical financial scam, the Ponzi scheme, has also emerged in the blockchain platform Ethereum. This Ponzi scheme deployed through smart contracts, also known as the smart Ponzi scheme, has caused a lot of economic losses and negative impacts. Existing methods for detecting smart Ponzi schemes on Ethereum mainly rely on bytecode features, opcode features, account features, and transaction behavior features of smart contracts, which are unable to truly characterize the behavioral features of Ponzi schemes, and thus generally perform poorly in terms of detection accuracy and false alarm rates. In this paper, we propose SourceP, a method to detect smart Ponzi schemes on the Ethereum platform using pre-trained models and data flow, which only requires using the source code of smart contracts as features. SourceP reduces the difficulty of data acquisition and feature extraction of existing detection methods. Specifically, we first convert the source code of a smart contract into a data flow graph and then introduce a pre-trained model based on learning code representations to build a classification model to identify Ponzi schemes in smart contracts. The experimental results show that SourceP achieves 87.2% recall and 90.7% F-score for detecting smart Ponzi schemes within Ethereum's smart contract dataset, outperforming state-of-the-art methods in terms of performance and sustainability. We also demonstrate through additional experiments that pre-trained models and data flow play an important contribution to SourceP, as well as proving that SourceP has a good generalization ability.
Paper Structure (20 sections, 8 equations, 5 figures, 4 tables)

This paper contains 20 sections, 8 equations, 5 figures, 4 tables.

Figures (5)

  • Figure 1: Flowchart of our method.
  • Figure 2: Flowchart of the traditional method.
  • Figure 3: Source code fragment of a smart Ponzi scheme.
  • Figure 4: The phase of input normalization, i.e., the process of converting source code into data flow. (a) source code fragment of smart Ponzi scheme; (b) conversion of source code into abstract syntax tree (AST); (c) identification of variable sequences in abstract syntax tree (AST); (d) generation of data flow graph by variable sequences.
  • Figure 5: SourceP's model structure.