Edge-aware baselines for ogbn-proteins in PyTorch Geometric: species-wise normalization, post-hoc calibration, and cost-accuracy trade-offs
Aleksandar Stanković, Dejan Lisica
TL;DR
The paper evaluates edge-aware baselines for ogbn-proteins in PyTorch Geometric, focusing on two choices: how 8-dimensional edge evidence is aggregated into node inputs, and how edges are used during message passing. It compares three normalization schemes, BatchNorm (BN), LayerNorm (LN), and a species-aware Conditional LayerNorm (CLN), and reports compute cost (time, VRAM, parameters) alongside ROC-AUC and micro-F1. GraphSAGE with sum-based edge-to-node features is the strongest baseline, and post-hoc per-label temperature scaling with per-label thresholds substantially improves micro-F1 and expected calibration error (ECE). The study also ablates the edge-to-node aggregation (sum, mean, max) and analyzes per-species differences and label dependencies, showing that simple aggregation choices can shift the accuracy–cost frontier and calibration metrics. Finally, the authors release standardized artifacts and scripts to enable reproducible experiments and further exploration of edge-aware baselines in cross-species protein-function prediction.
Abstract
We present reproducible, edge-aware baselines for ogbn-proteins in PyTorch Geometric (PyG). We study two system choices that dominate practice: (i) how 8-dimensional edge evidence is aggregated into node inputs, and (ii) how edges are used inside message passing. Our strongest baseline is GraphSAGE with sum-based edge-to-node features. We compare LayerNorm (LN), BatchNorm (BN), and a species-aware Conditional LayerNorm (CLN), and report compute cost (time, VRAM, parameters) together with accuracy (ROC-AUC) and decision quality (thresholded micro-F1 and expected calibration error, ECE). In our primary experimental setup (hidden size 512, 3 layers, 3 seeds), sum consistently beats mean and max; BN attains the best AUC, while CLN matches the AUC frontier with better thresholded F1. Finally, post-hoc per-label temperature scaling plus per-label thresholds substantially improves micro-F1 and ECE with negligible AUC change, and light label-correlation smoothing yields small additional gains. We release the standardized artifacts and scripts used for all runs reported in the paper.
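To make choice (i) concrete, the following is a minimal sketch of edge-to-node aggregation with sum, mean, and max reductions, written in plain PyTorch. It is an illustration under stated assumptions, not the released implementation: the function name `edge_to_node_features`, the convention of aggregating onto the destination node (`edge_index[1]`), and the zero fill for isolated nodes are all hypothetical choices made here for clarity.

```python
import torch


def edge_to_node_features(edge_index, edge_attr, num_nodes, reduce="sum"):
    """Aggregate per-edge features onto nodes (hypothetical helper).

    edge_index: LongTensor [2, E]; row 1 is assumed to hold destination nodes.
    edge_attr:  FloatTensor [E, D]; D = 8 for ogbn-proteins edge evidence.
    Nodes without incoming edges receive zeros (an assumption of this sketch).
    """
    dst = edge_index[1]
    dim = edge_attr.size(1)
    out = torch.zeros(num_nodes, dim, dtype=edge_attr.dtype, device=edge_attr.device)

    if reduce in ("sum", "mean"):
        # Accumulate every edge's feature vector into its destination row.
        out.index_add_(0, dst, edge_attr)
        if reduce == "mean":
            # Count incoming edges per node; clamp avoids division by zero.
            deg = torch.zeros(num_nodes, dtype=edge_attr.dtype, device=edge_attr.device)
            deg.index_add_(0, dst, torch.ones_like(dst, dtype=edge_attr.dtype))
            out = out / deg.clamp(min=1).unsqueeze(1)
        return out

    if reduce == "max":
        # Element-wise maximum over incoming edges; untouched rows stay zero.
        index = dst.unsqueeze(1).expand(-1, dim)
        out.scatter_reduce_(0, index, edge_attr, reduce="amax", include_self=False)
        return out

    raise ValueError(f"unknown reduce: {reduce!r}")


# Toy graph: 3 nodes, edges 0->1, 2->1, 1->0, each with an 8-D feature vector.
edge_index = torch.tensor([[0, 2, 1], [1, 1, 0]])
edge_attr = torch.tensor([[1.0] * 8, [3.0] * 8, [5.0] * 8])

x_sum = edge_to_node_features(edge_index, edge_attr, num_nodes=3, reduce="sum")
x_mean = edge_to_node_features(edge_index, edge_attr, num_nodes=3, reduce="mean")
x_max = edge_to_node_features(edge_index, edge_attr, num_nodes=3, reduce="max")
```

In this toy graph, node 1 receives two edges, so its sum, mean, and max features differ, while node 2 has no incoming edges and stays at zero. Unlike mean and max, the sum reduction preserves information about how many edges a node has, which is one plausible reason it could behave differently from the other two.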
