Improving Smart Contract Security with Contrastive Learning-based Vulnerability Detection

Yizhou Chen, Zeyu Sun, Zhihao Gong, Dan Hao

TL;DR

This work targets smart contract vulnerability detection (SCVD) by addressing a limitation of prior deep learning (DL) methods: they treat contracts as independent entities. It introduces Clear, a contrastive learning (CL) framework that uses correlation labels to learn fine-grained relationships among contracts, then fuses these correlation features with semantic contract representations for vulnerability detection. On a large-scale dataset of over 40K contracts, Clear outperforms 13 baselines, achieving an average F1 of 0.9452, which is 9.73% to 39.99% higher than existing deep learning methods. The CL module can also boost other DL architectures, and the work provides a reproducible pipeline, underscoring the practical value of modeling cross-contract correlations in SCVD and potentially in other software-security domains.

Abstract

Smart contract vulnerabilities (SCVs) have emerged as a major threat to the transaction security of blockchain. Existing state-of-the-art methods rely on deep learning to mitigate this threat. They treat each input contract as an independent entity and feed it into a deep learning model to learn vulnerability patterns by fitting vulnerability labels. However, they disregard the correlation between contracts, failing to consider the commonalities between contracts of the same type and the differences among contracts of different types. As a result, the performance of these methods falls short of the desired level. To tackle this problem, we propose a novel Contrastive Learning Enhanced Automated Recognition Approach for Smart Contract Vulnerabilities, named Clear. In particular, Clear employs a contrastive learning (CL) model to capture the fine-grained correlation information among contracts and generates correlation labels based on the relationships between contracts to guide the training process of the CL model. Finally, it combines the correlation and the semantic information of the contract to detect SCVs. Through an empirical evaluation on a large-scale real-world dataset of over 40K smart contracts and a comparison with 13 state-of-the-art baseline methods, we show that Clear achieves (1) better performance than all baseline methods and (2) a 9.73%-39.99% higher F1-score than existing deep learning methods.
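
To make the idea of correlation labels concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a pair of contracts gets correlation label 1 when both share the same vulnerability label and 0 otherwise, and it swaps in the classic margin-based pairwise contrastive loss as a stand-in for whatever objective Clear actually uses. The function names, the margin value, and the toy embeddings are all hypothetical.

```python
import math
from itertools import combinations


def correlation_label(vuln_a: int, vuln_b: int) -> int:
    """Assumed rule: 1 if two contracts share a vulnerability label, else 0."""
    return 1 if vuln_a == vuln_b else 0


def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pair_loss(u, v, corr: int, margin: float = 1.0) -> float:
    """Margin-based pairwise contrastive loss (a stand-in objective).

    Correlated pairs are pulled together (squared distance);
    uncorrelated pairs are pushed at least `margin` apart.
    """
    d = euclidean(u, v)
    if corr == 1:
        return d ** 2
    return max(0.0, margin - d) ** 2


def batch_loss(embeddings, vuln_labels, margin: float = 1.0) -> float:
    """Average the pair loss over every unordered pair in the batch."""
    pairs = list(combinations(range(len(embeddings)), 2))
    total = sum(
        pair_loss(
            embeddings[i],
            embeddings[j],
            correlation_label(vuln_labels[i], vuln_labels[j]),
            margin,
        )
        for i, j in pairs
    )
    return total / len(pairs)


# Toy batch: two vulnerable contracts with identical embeddings and one
# non-vulnerable contract embedded far away from both.
toy_embeddings = [[0.0, 0.0], [0.0, 0.0], [3.0, 4.0]]
toy_labels = [1, 1, 0]
print(batch_loss(toy_embeddings, toy_labels))
```

In this toy batch the loss is zero because same-label contracts already coincide and the different-label contract lies beyond the margin. Under the stated assumptions, that is the kind of feature geometry a contrastive stage is meant to encourage before the correlation features are combined with semantic features for detection.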

Paper Structure

This paper contains 25 sections, 11 equations, 4 figures, and 3 tables.

Figures (4)

  • Figure 1: An example of smart contracts.
  • Figure 2: Architecture of our method and the available methods.
  • Figure 3: Overview of Clear. Solid lines show the data flow of the CL process; dotted arrows show the data flow of the subsequent vulnerability detection process.
  • Figure 4: The feature distribution of smart contracts at different epochs during the CL stage.