Convexified Message-Passing Graph Neural Networks
Saar Cohen, Noa Agmon, Uri Shaham
TL;DR
CGNNs address the non-convex training challenge of message-passing GNNs by mapping nonlinear graph filters into a reproducing kernel Hilbert space, turning learning into a convex optimization problem solvable with projected gradient methods. The framework provides a theoretically rigorous generalization bound that asymptotically matches the best possible GCN, and extends to deeper architectures via layer-wise training. By kernelizing nonlinear activations and relaxing low-rank constraints with nuclear-norm penalties, CGNNs retain the standard GCN modularity while enabling scalable, convex optimization and robust performance. Empirically, CGNNs achieve substantial accuracy gains (often 10–40%) over their non-convex counterparts across diverse graph benchmarks, while maintaining robustness in data-scarce regimes and avoiding heavy over-parameterization.
Abstract
Graph Neural Networks (GNNs) are key tools for graph representation learning, demonstrating strong results across diverse prediction tasks. In this paper, we present Convexified Message-Passing Graph Neural Networks (CGNNs), a novel and general framework that combines the power of message-passing GNNs with the tractability of convex optimization. By mapping their nonlinear filters into a reproducing kernel Hilbert space, CGNNs transform training into a convex optimization problem, which projected gradient methods can solve both efficiently and optimally. Convexity further allows CGNNs' statistical properties to be analyzed accurately and rigorously. For two-layer CGNNs, we establish rigorous generalization guarantees, showing convergence to the performance of an optimal GNN. To scale to deeper architectures, we adopt a principled layer-wise training strategy. Experiments on benchmark datasets show that CGNNs significantly exceed the performance of leading GNN models, obtaining 10-40% higher accuracy in most cases, underscoring their promise as a powerful and principled method with strong theoretical foundations. In rare cases where improvements are not quantitatively substantial, the convex models either slightly exceed or match the baselines, stressing their robustness and wide applicability. Though over-parameterization is often used to enhance performance in non-convex models, we show that our CGNNs yield shallow convex models that can surpass non-convex ones in accuracy and model compactness.
