Fishnets: Information-Optimal, Scalable Aggregation for Sets and Graphs
T. Lucas Makinen, Justin Alsing, Benjamin D. Wandelt
TL;DR
Fishnets offer an information-theoretic aggregation framework for sets and graph neighborhoods by learning per-object score embeddings and inverse-Fisher weights. Through Twin Fisher-Score Networks, they aggregate per-datapoint information into near-optimal dataset summaries that saturate the available information and remain robust under distribution shifts and censorship. Empirically, Fishnets deliver scalable Bayesian inference and a drop-in GNN aggregation that matches or surpasses state-of-the-art performance on ogbn-proteins with far fewer learned parameters and faster training. These results suggest a practical route to information-rich, scalable summaries of heterogeneous data in simulation-based inference (SBI) and graph learning.
Abstract
Set-based learning is an essential component of modern deep learning and network science. Graph Neural Networks (GNNs) and their edge-free counterparts, Deepsets, have proven remarkably useful on ragged and topologically challenging datasets. The key to learning informative embeddings for set members is a specified aggregation function, usually a sum, max, or mean. We propose Fishnets, an aggregation strategy for learning information-optimal embeddings for sets of data, for both Bayesian inference and graph aggregation. We demonstrate that i) Fishnets neural summaries can be scaled optimally to an arbitrary number of data objects, ii) Fishnets aggregations are robust to changes in data distribution, unlike standard Deepsets, iii) Fishnets saturate Bayesian information content and extend to regimes where MCMC techniques fail, and iv) Fishnets can be used as a drop-in aggregation scheme within GNNs. We show that by adopting a Fishnets aggregation scheme for message passing, GNNs can achieve state-of-the-art performance versus architecture size on ogbn-proteins data over existing benchmarks, with a fraction of the learnable parameters and faster training time.
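To make the aggregation idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a toy model in which each datum is drawn from a Gaussian with a shared mean and a known, object-specific noise level, so the per-object score and Fisher information can be written analytically. In Fishnets these two quantities would be produced by learned networks. The function name `fisher_aggregate` and all variable names are hypothetical. The sketch sums per-object scores and Fisher matrices and then applies the inverse of the summed Fisher matrix, which in this toy case reproduces the inverse-variance-weighted mean.

```python
import numpy as np

def fisher_aggregate(scores, fishers, theta_fid):
    """Sum per-object scores (n, p) and Fisher matrices (n, p, p),
    then return theta_fid + F^{-1} t together with the summed Fisher F."""
    t = scores.sum(axis=0)    # aggregated score, shape (p,)
    F = fishers.sum(axis=0)   # aggregated Fisher matrix, shape (p, p)
    return theta_fid + np.linalg.solve(F, t), F

# Toy data (assumption): d_i ~ Normal(theta_true, sigma_i^2), heterogeneous noise.
rng = np.random.default_rng(0)
theta_true = 1.5
sigma = rng.uniform(0.5, 5.0, size=1000)
data = theta_true + sigma * rng.normal(size=sigma.size)

# Analytic stand-ins for the learned per-object outputs, at a fiducial point.
theta_fid = np.array([0.0])
scores = ((data - theta_fid[0]) / sigma**2)[:, None]   # shape (n, 1)
fishers = (1.0 / sigma**2)[:, None, None]              # shape (n, 1, 1)

theta_hat, F = fisher_aggregate(scores, fishers, theta_fid)

# Reference: the inverse-variance-weighted mean, and the plain mean for contrast.
ivw_mean = np.sum(data / sigma**2) / np.sum(1.0 / sigma**2)
plain_mean = data.mean()
print("Fisher-aggregated estimate:", theta_hat[0])
print("inverse-variance-weighted mean:", ivw_mean)
print("unweighted mean:", plain_mean)
```

Because both the score and the Fisher information are plain sums over objects, the same function applies unchanged to a set of any size, and noisy objects are down-weighted automatically. An unweighted mean aggregation has neither property.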
