Deep Smart Contract Intent Detection

Youwei Huang; Sen Fang; Jianwen Li; Jiachun Tao; Bin Hu; Tao Zhang

TL;DR

This paper tackles the problem of detecting developers' intents in smart contracts to mitigate internal risks in Web3. It introduces SmartIntentNN, a three-part framework that combines Universal Sentence Encoder embeddings, a K-means based intent-highlight module, and a BiLSTM-based multi-label classifier to identify 10 negative developer intents. On a dataset of over 40,000 Binance Smart Chain contracts, the method achieves an F1-score of 0.8633 and outperforms baselines including LSTM, CNN, and GPT-based models, while also providing insights into category-level performance and data imbalance effects. The approach enables automated, scalable auditing of smart contracts, reducing reliance on costly manual audits and enhancing security for DApps and their users.

Abstract

In recent years, research in software security has concentrated on identifying vulnerabilities in smart contracts to prevent significant losses of crypto assets on blockchains. Despite early successes in this area, detecting developers' intents in smart contracts has become a more pressing issue, as malicious intents have caused substantial financial losses. Unfortunately, existing research lacks effective methods for detecting development intents in smart contracts. To address this gap, we propose SmartIntentNN (Smart Contract Intent Neural Network), a deep learning model designed to automatically detect development intents in smart contracts. SmartIntentNN leverages a pre-trained sentence encoder to generate contextual representations of smart contracts, employs a K-means clustering model to identify and highlight prominent intent features, and utilizes a bidirectional LSTM-based deep neural network for multi-label classification. We trained and evaluated SmartIntentNN on a dataset containing over 40,000 real-world smart contracts, employing self-comparison baselines in our experimental setup. The results show that SmartIntentNN achieves an F1-score of 0.8633 in identifying intents across 10 distinct categories, outperforming all baselines and addressing the gap in smart contract detection by incorporating intent analysis.
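To make the reported metric concrete, the following is a minimal sketch of how an F1-score can be computed for multi-label intent predictions. It assumes micro-averaging over all (contract, intent) pairs and a 0.5 decision threshold; the averaging scheme and threshold actually used in the paper are not stated in this summary, and the function names `binarize` and `micro_f1` are hypothetical.

```python
def binarize(probabilities, threshold=0.5):
    """Turn per-intent probabilities into 0/1 labels (threshold is an assumption)."""
    return [[1 if p >= threshold else 0 for p in row] for row in probabilities]


def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over every (contract, intent) pair.

    y_true and y_pred are lists of equal-length 0/1 rows, one row per contract
    and one column per intent category.
    """
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1  # intent present and detected
            elif t == 0 and p == 1:
                fp += 1  # intent reported but absent
            elif t == 1 and p == 0:
                fn += 1  # intent present but missed
    if tp == 0:
        return 0.0  # avoids division by zero when nothing is correctly detected
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example with 2 contracts and 3 intent categories (made-up values).
labels = [[1, 0, 1], [0, 1, 0]]
scores = [[0.9, 0.2, 0.4], [0.1, 0.8, 0.7]]
predictions = binarize(scores)
print(micro_f1(labels, predictions))
```

In the toy example there are two true positives, one false positive, and one false negative, so precision and recall are both 2/3.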

Paper Structure (26 sections, 24 equations, 8 figures, 2 tables)

Introduction
Motivation
Background
Malicious Smart Contract Intent
Sentence Embedding
Bidirectional LSTM
Dataset
Intent Labels
Code Cleaning
Smart Contract Code Tree
Approach
Smart Contract Embedding
Intent Highlight
Multi-label Classification
Evaluation
...and 11 more sections

Figures (8)

Figure 1: Examples of a smart contract with malicious intents. BSC address: 0xDDa7f9273a092655a1cF077FF0155d64000ccE2A.
Figure 2: A depiction of how developers exploit smart contracts for illegal gain, illustrating the process of creating and deploying contracts with malicious intents.
Figure 3: Dataset preprocessing steps: (i) download open-source smart contracts from the BSC blockchain and label them; (ii) merge and clean the source code; (iii) generate the smart contract code tree.
Figure 4: Overview of the SmartIntentNN workflow: (i) encode smart contracts through the Universal Sentence Encoder; (ii) identify and highlight features of developers' distinct intents in smart contracts using a K-means model; (iii) feed the intent-highlighted data into a DNN for learning the representations of smart contracts. The architecture of our DNN includes an input layer, a BiLSTM layer, and a dense layer to output the multi-label binary classification results.
Figure 5: A 3D coordinate system illustrating the principle of intent highlighting. The larger the vector angle deviates from the centroid, the stronger its intent.
...and 3 more figures
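The intent-highlighting principle described for Figure 5 can be sketched as follows. This is an illustrative approximation only, not the authors' implementation: it assumes the centroid has already been obtained from a K-means model, measures deviation with cosine distance, and amplifies vectors beyond a cutoff. The names `cosine_distance` and `highlight`, and the `threshold` and `scale` values, are hypothetical.

```python
import math


def cosine_distance(u, v):
    """1 - cos(angle) between two non-zero vectors; grows as the angle widens."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def highlight(vectors, centroid, threshold=0.2, scale=2.0):
    """Amplify embeddings whose angle deviates strongly from the centroid.

    threshold and scale are illustrative assumptions, not values from the paper.
    """
    highlighted = []
    for v in vectors:
        if cosine_distance(v, centroid) > threshold:
            highlighted.append([scale * x for x in v])  # strong-intent vector
        else:
            highlighted.append(list(v))  # close to the centroid, left unchanged
    return highlighted


# Toy 3D embeddings, mirroring the 3D coordinate system of Figure 5.
centroid = [1.0, 0.0, 0.0]
embeddings = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
print(highlight(embeddings, centroid))
```

The first toy vector points along the centroid and is left as-is; the second is orthogonal to it and is scaled up before being passed on to the classifier.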
