ScamSweeper: Detecting Illegal Accounts in Web3 Scams via Transactions Analysis
Xiaoqi Li, Wenkai Li, Zhijie Liu, Meikang Qiu, Zhiquan Liu, Sen Nie, Zongwei Li, Shi Wu, Yuqing Zhang
TL;DR
This paper introduces ScamSweeper, a temporal-subgraph learning framework that detects web3 scams on Ethereum by sampling transaction graphs together with their temporal structure and learning how the resulting subgraphs evolve. It proposes the Structure Temporal Random Walk (STRWalk) to efficiently extract temporally annotated subgraphs, and combines a directed graph encoder with a variational Transformer to capture both structural patterns and temporal dynamics. Empirical results on large-scale on-chain datasets show that ScamSweeper outperforms state-of-the-art baselines in both web3 scam detection and phishing detection, with substantial gains in F1-score and recall. The approach offers scalable, temporally aware detection of on-chain accounts, enabling more effective protection against evolving web3 scams and associated phishing activities.
Abstract
Web3 applications have grown rapidly in recent years, especially on the Ethereum platform, and have become a target for scammers. Web3 scams imitate the services of legitimate platforms and mimic regular activity to deceive users. However, previous studies have concentrated primarily on de-anonymization and phishing nodes, neglecting the distinctive features of web3 scams. Moreover, current phishing account detection tools rely on graph learning or sampling algorithms to obtain graph features, yet large-scale transaction networks with temporal attributes follow a power-law distribution, which makes web3 scams difficult to detect. To overcome these challenges, we present ScamSweeper, a novel framework that identifies web3 scams on Ethereum by emphasizing the dynamic evolution of transaction graphs. ScamSweeper samples the network with a structure temporal random walk, an optimized sampling method that considers both temporal attributes and structural information. A directed graph encoder then generates features for the subgraph in each temporal interval, and these features are ordered into a sequence. A variational Transformer extracts the dynamic evolution from this subgraph sequence. Furthermore, we collect a large-scale transaction dataset of web3 scam, phishing, and normal accounts drawn from the first 18 million block heights on Ethereum, and we comprehensively analyze how these account types differ in attributes including nodes, edges, and degree distribution. Our experiments indicate that ScamSweeper outperforms SIEGE, Ethident, and PDTGA in detecting web3 scams, improving the weighted F1-score by at least 17.29% over a base value of 0.59. In phishing node detection, ScamSweeper improves the F1-score by at least 17.5% over DGTSG and BERT4ETH, from a base value of 0.80.
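
To make the sampling idea concrete, the following is a minimal sketch of a time-respecting random walk over transaction edges, followed by grouping the walked edges into temporal intervals to form a subgraph sequence. It is not the paper's STRWalk: the uniform choice among candidate edges, the function names `temporal_random_walk` and `split_into_intervals`, the `interval` parameter, and the toy edge list are all assumptions made for illustration, and the structural weighting that the abstract mentions is omitted.

```python
import random
from collections import defaultdict


def temporal_random_walk(edges, start, walk_length, rng):
    """Walk over (src, dst, timestamp) edges so that timestamps never decrease.

    Assumption: the next edge is chosen uniformly among valid candidates.
    """
    out_edges = defaultdict(list)
    for src, dst, ts in edges:
        out_edges[src].append((dst, ts))

    walk = []
    node, current_ts = start, float("-inf")
    for _ in range(walk_length):
        # Only follow transactions that happen at or after the current time.
        candidates = [(dst, ts) for dst, ts in out_edges.get(node, []) if ts >= current_ts]
        if not candidates:
            break
        dst, ts = rng.choice(candidates)
        walk.append((node, dst, ts))
        node, current_ts = dst, ts
    return walk


def split_into_intervals(walk, interval):
    """Group walked edges into fixed-width time windows, ordered by time."""
    buckets = defaultdict(list)
    for src, dst, ts in walk:
        buckets[ts // interval].append((src, dst, ts))
    return [buckets[key] for key in sorted(buckets)]


# Hypothetical toy transactions: (sender, receiver, timestamp).
edges = [("a", "b", 1), ("b", "c", 5), ("b", "d", 2), ("c", "a", 12), ("d", "c", 11)]
walk = temporal_random_walk(edges, "a", 4, random.Random(0))
subgraph_sequence = split_into_intervals(walk, interval=10)
```

Each element of `subgraph_sequence` is the edge list of one temporal interval. In a pipeline like the one described above, each such subgraph would be passed to a graph encoder and the resulting feature sequence to a sequence model.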
