HiLoMix: Robust High- and Low-Frequency Graph Learning Framework for Mixing Address Association

Xiaofan Tu, Tiantian Duan, Shuyi Miao, Hanwen Zhang, Yi Sun

TL;DR

HiLoMix addresses the deanonymization challenge of mixing addresses on Ethereum by tackling label noise and scarcity through HAMIG and a frequency-aware contrastive framework. It jointly trains high-pass and low-pass GNNs with confidence-weighted supervision and fuses their predictions via stacking over heterogeneous models. The approach achieves state-of-the-art results on classification and ranking metrics, with notable improvements such as $F_1$ gains of $5.69\%$, $AUC$ gains of $7.34\%$, and $MRR$ gains of $15.61\%$, and it provides a curated ground-truth dataset for future study. This work offers scalable, robust methods for identifying mixing address associations in large blockchain graphs, with practical implications for blockchain privacy and enforcement.

Abstract

As mixing services are increasingly exploited by malicious actors for illicit transactions, mixing address association has emerged as a critical research task. A range of approaches has been explored, with graph-based models standing out for their ability to capture structural patterns in transaction networks. However, these approaches face two main challenges, label noise and label scarcity, which lead to suboptimal performance and limited generalization. To address these challenges, we propose HiLoMix, a graph-based learning framework specifically designed for mixing address association. First, we construct the Heterogeneous Attributed Mixing Interaction Graph (HAMIG) to enrich the topological structure. Second, we introduce frequency-aware graph contrastive learning that captures complementary structural signals from high- and low-frequency graph views. Third, we employ weakly supervised learning that assigns confidence-based weights to noisy labels. We then jointly train high-pass and low-pass GNNs using both unsupervised contrastive signals and confidence-based supervision to learn robust node representations. Finally, we adopt a stacking framework to fuse predictions from multiple heterogeneous models, further improving generalization and robustness. Experimental results demonstrate that HiLoMix outperforms existing methods in mixing address association.
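
The abstract does not state which filters define the high- and low-frequency views, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes the common spectral choice of $(I + \hat{A})/2$ as a low-pass filter and $(I - \hat{A})/2$ as a high-pass filter, where $\hat{A}$ is the symmetrically normalized adjacency matrix. The function names and the toy graph are hypothetical.

```python
import numpy as np

def normalized_adjacency(adj):
    """Return D^{-1/2} A D^{-1/2}; isolated nodes keep all-zero rows."""
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1)
    inv_sqrt = np.zeros_like(deg)
    nonzero = deg > 0
    inv_sqrt[nonzero] = deg[nonzero] ** -0.5
    return inv_sqrt[:, None] * adj * inv_sqrt[None, :]

def low_pass(adj, x):
    """Assumed low-pass view: average each node with its neighbours."""
    a_hat = normalized_adjacency(adj)
    return 0.5 * (np.eye(a_hat.shape[0]) + a_hat) @ x

def high_pass(adj, x):
    """Assumed high-pass view: keep each node's difference from its neighbours."""
    a_hat = normalized_adjacency(adj)
    return 0.5 * (np.eye(a_hat.shape[0]) - a_hat) @ x

# Toy 4-node cycle graph (hypothetical data, every node has degree 2).
cycle = np.array([
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
])
smooth = np.ones((4, 1))                              # identical on all nodes
alternating = np.array([[1.0], [-1.0], [1.0], [-1.0]])  # flips across every edge
```

Under these assumptions the two views are complementary: they sum back to the original signal, a signal that is constant across neighbours survives only the low-pass filter, and a signal that flips sign across every edge survives only the high-pass filter.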

Paper Structure

This paper contains 25 sections, 18 equations, 5 figures, 5 tables.

Figures (5)

  • Figure 1: Mixing process of Tornado Cash. Characters marked with a devil icon represent illicit users. Two characters sharing the same icon indicate that they have transferred funds through the mixer.
  • Figure 2: Overview of the HiLoMix framework.
  • Figure 3: Performance of HiLoMix when the base models in the stacking framework are trained with different values of $K$ in $K$-fold cross-validation.
  • Figure 4: Performance of HiLoMix under different settings of hyperparameters $\alpha$ and $\lambda$.
  • Figure 5: Dynamics of three label sets during training.