Provably Powerful Graph Neural Networks for Directed Multigraphs
Béni Egressy, Luc von Niederhäusern, Jovan Blanusa, Erik Altman, Roger Wattenhofer, Kubilay Atasu
TL;DR
This paper tackles learning on directed multigraphs by proposing three simple adaptations (reverse message passing, directed multigraph port numbering, and ego IDs) that transform standard message-passing GNNs into provably powerful directed multigraph learners. The authors prove that, when combined, these adaptations enable a GNN to detect any directed subgraph pattern, and they validate the theory on synthetic pattern-detection tasks and on two financial-crime tasks (money laundering and phishing detection). Empirically, the adaptations yield large gains in minority-class F1 score over baselines: reverse message passing is what enables out-degree detection, port numbering is what enables fan-in and fan-out detection, and ego IDs help on a few of the more complex patterns. Because the adaptations are not specific to finance, the approach may extend to other directed multigraph domains; broader applications and the trade-off between expressive power and computational cost are left as future work.
Abstract
This paper analyses a set of simple adaptations that transform standard message-passing Graph Neural Networks (GNN) into provably powerful directed multigraph neural networks. The adaptations include multigraph port numbering, ego IDs, and reverse message passing. We prove that the combination of these theoretically enables the detection of any directed subgraph pattern. To validate the effectiveness of our proposed adaptations in practice, we conduct experiments on synthetic subgraph detection tasks, which demonstrate outstanding performance with almost perfect results. Moreover, we apply our proposed adaptations to two financial crime analysis tasks. We observe dramatic improvements in detecting money laundering transactions, improving the minority-class F1 score of a standard message-passing GNN by up to 30%, and closely matching or outperforming tree-based and GNN baselines. Similarly impressive results are observed on a real-world phishing detection dataset, boosting three standard GNNs' F1 scores by around 15% and outperforming all baselines.
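To make the three adaptations concrete, here is a minimal pure-Python sketch on a toy directed multigraph. It is an illustration under stated assumptions, not the authors' implementation: the function names (`port_numbers`, `one_round_sum`, `ego_id_features`) are hypothetical, ports are numbered by order of first appearance in the edge list, and parallel edges from the same neighbor are assumed to share a port, which is what lets fan-in (distinct in-neighbors) be told apart from in-degree (incoming edge count).

```python
from collections import defaultdict


def port_numbers(edges):
    """Return one (in_port, out_port) pair per directed edge.

    edges: list of (src, dst) pairs; repeated pairs are parallel edges.
    Assumption: ports are assigned in order of first appearance, and
    parallel edges between the same two nodes share the same port.
    """
    in_ports = defaultdict(dict)   # dst -> {src: port seen at dst}
    out_ports = defaultdict(dict)  # src -> {dst: port seen at src}
    ports = []
    for src, dst in edges:
        p_in = in_ports[dst].setdefault(src, len(in_ports[dst]))
        p_out = out_ports[src].setdefault(dst, len(out_ports[src]))
        ports.append((p_in, p_out))
    return ports


def one_round_sum(edges, num_nodes, reverse=False):
    """One round of sum aggregation where every edge carries the message 1.

    With reverse=False, messages flow src -> dst (yields in-degree).
    With reverse=True, messages flow dst -> src (yields out-degree),
    mimicking the extra direction added by reverse message passing.
    """
    h = [0] * num_nodes
    for src, dst in edges:
        receiver = src if reverse else dst
        h[receiver] += 1
    return h


def ego_id_features(num_nodes, center):
    """Binary feature that marks the center (ego) node of a neighborhood."""
    return [1 if v == center else 0 for v in range(num_nodes)]


# Toy multigraph: two parallel edges 0 -> 2, plus 1 -> 2 and 2 -> 3.
edges = [(0, 2), (0, 2), (1, 2), (2, 3)]
ports = port_numbers(edges)

print(ports)                                   # [(0, 0), (0, 0), (1, 0), (0, 0)]
print(one_round_sum(edges, 4))                 # in-degrees:  [0, 0, 3, 1]
print(one_round_sum(edges, 4, reverse=True))   # out-degrees: [2, 1, 1, 0]

# Fan-in of node 2 = number of distinct incoming ports (2), not in-degree (3).
fan_in_2 = len({p_in for (_, d), (p_in, _) in zip(edges, ports) if d == 2})
print(fan_in_2)                                # 2
print(ego_id_features(4, center=2))            # [0, 0, 1, 0]
```

In this toy graph node 2 has in-degree 3 but fan-in 2. Plain sum aggregation over incoming edges only sees the former, while the port numbers expose the latter. Aggregating along reversed edges is what gives a node access to its out-degree at all.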
