Normalisation and Initialisation Strategies for Graph Neural Networks in Blockchain Anomaly Detection

Dang Sy Duy, Nguyen Duy Chien, Kapil Dev, Jeff Nijsse

TL;DR

This work presents a systematic ablation of initialisation and normalisation strategies across three GNN architectures (GCN, GAT, and GraphSAGE) on the Elliptic Bitcoin dataset, revealing that initialisation and normalisation are architecture-dependent.

Abstract

Graph neural networks (GNNs) offer a principled approach to financial fraud detection by jointly learning from node features and transaction graph topology. However, their effectiveness on real-world anti-money laundering (AML) benchmarks depends critically on training practices, specifically weight initialisation and normalisation, that remain underexplored. We present a systematic ablation of initialisation and normalisation strategies across three GNN architectures (GCN, GAT, and GraphSAGE) on the Elliptic Bitcoin dataset. Our experiments reveal that the effects of initialisation and normalisation are architecture-dependent: GraphSAGE achieves the strongest performance with Xavier initialisation alone, GAT benefits most from combining GraphNorm with Xavier initialisation, and GCN shows limited sensitivity to these modifications. These findings offer practical, architecture-specific guidance for deploying GNNs in AML pipelines for datasets with severe class imbalance. We release a reproducible experimental framework with temporal data splits, seeded runs, and full ablation results.

Paper Structure (23 sections, 3 figures, 3 tables)


Figures (3)

  • Figure 1: The base model architecture. The evaluated architecture follows a modular design: an input layer (166 transaction features) is followed by 1–3 GNN layers of one of three types (GCN, GAT, or GraphSAGE). Each GNN variant is followed by GraphNorm and dropout (0.08–0.64) for regularisation. The processed features are passed to an embedding layer (64–128 dimensions) and then to a classification head (linear layer) for binary prediction. All layers use standard weight initialisation methods.
  • Figure 2: Convergence behaviour of GCN, GAT, and GraphSAGE on the Elliptic dataset. (a) Training loss curves show that GraphSAGE converges faster than GCN and GAT. (b) Validation score trajectories highlight that GraphSAGE consistently achieves higher validation scores, while GAT exhibits more variance and GCN stabilises more slowly.
  • Figure 3: Performance comparison of GCN, GAT, and GraphSAGE on the Elliptic dataset. (a) Overall performance across metrics including AUC, AUPRC, and F1 at multiple thresholds. (b) Precision-Recall scatter plots at 90% confidence, illustrating the distribution across architectures.
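
The two techniques ablated in this work, Xavier initialisation and GraphNorm, can be illustrated with a minimal sketch. It uses only the Python standard library and follows the commonly stated definitions of both methods; it is not the authors' released framework. The function names `xavier_uniform` and `graph_norm` are hypothetical, and the parameters `alpha`, `gamma`, and `beta` are treated here as fixed scalars, whereas in practice they would be learnable per-feature parameters. The layer sizes (166 inputs, 64 outputs) are taken from the Figure 1 caption.

```python
import math
import random


def xavier_uniform(fan_in, fan_out, rng):
    """Sample a fan_in x fan_out weight matrix from U(-a, a),
    where a = sqrt(6 / (fan_in + fan_out)) (Xavier/Glorot uniform)."""
    bound = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-bound, bound) for _ in range(fan_out)]
            for _ in range(fan_in)]


def graph_norm(h, alpha=1.0, gamma=1.0, beta=0.0, eps=1e-5):
    """Apply GraphNorm to the node features of a single graph.

    h is a list of rows, one row of features per node. For each feature
    column j: subtract alpha * mean_j, divide by the standard deviation of
    the shifted values, then scale by gamma and shift by beta.
    Assumption: alpha, gamma, beta are scalars here for brevity.
    """
    n, d = len(h), len(h[0])
    out = [[0.0] * d for _ in range(n)]
    for j in range(d):
        col = [row[j] for row in h]
        mean = sum(col) / n
        shifted = [x - alpha * mean for x in col]
        var = sum(s * s for s in shifted) / n
        std = math.sqrt(var + eps)
        for i in range(n):
            out[i][j] = gamma * shifted[i] / std + beta
    return out


# Seeded generator, in the spirit of the seeded runs mentioned in the abstract.
rng = random.Random(0)
weights = xavier_uniform(166, 64, rng)   # 166 input features -> 64 hidden units
normalised = graph_norm([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
```

With `alpha = 1.0`, GraphNorm reduces to standardising each feature over the nodes of a graph; smaller values of `alpha` retain part of the per-graph mean, which is the property that distinguishes it from plain per-graph standardisation.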