Bine Trees: Enhancing Collective Operations by Optimizing Communication Locality
Daniele De Sensi, Saverio Pasqualoni, Lorenzo Piarulli, Tommaso Bonato, Seydou Ba, Matteo Turisini, Jens Domke, Torsten Hoefler
TL;DR
The paper addresses poor communication locality in collective operations on oversubscribed HPC networks by introducing Bine (binomial negabinary) trees, a construction that shortens the distance between communicating ranks. The authors define distance-halving and distance-doubling variants, give formal definitions based on negabinary rank representations and partner selection rules, and use them to build gather/scatter, allreduce, alltoall, and related collectives. Experiments on four large-scale systems (Dragonfly, Dragonfly+, oversubscribed fat-tree, and torus) show speedups of up to 5x and global-link traffic reductions of up to 33% across multiple MPI implementations and vector sizes. The results position Bine trees as a general, topology-agnostic alternative to binomial-tree and butterfly-based collectives for current HPC and data-center networks, with potential applicability to future heterogeneous and multi-technology clusters.
Abstract
Communication locality plays a key role in the performance of collective operations on large HPC systems, especially on oversubscribed networks where groups of nodes are fully connected internally but sparsely linked through global connections. We present Bine (binomial negabinary) trees, a family of collective algorithms that improve communication locality. Bine trees maintain the generality of binomial trees and butterflies while cutting global-link traffic by up to 33%. We implement eight Bine-based collectives and evaluate them on four large-scale supercomputers with Dragonfly, Dragonfly+, oversubscribed fat-tree, and torus topologies, achieving up to 5x speedups and consistent reductions in global-link traffic across different vector sizes and node counts.
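The "negabinary" in the name refers to writing integers in base -2. As a minimal sketch of that number system only (not the paper's rank-mapping or partner selection rule, which is not reproduced in this summary), the following Python converts integers to and from negabinary digit strings. The function names `to_negabinary` and `from_negabinary` are hypothetical and chosen for illustration.

```python
def to_negabinary(n: int) -> str:
    """Return the base -2 digit string of n (illustrative helper, not from the paper)."""
    if n == 0:
        return "0"
    digits = []
    while n != 0:
        # Python's divmod with a negative divisor yields a remainder in {0, -1}.
        n, r = divmod(n, -2)
        if r < 0:
            # Shift the remainder into {0, 1} and compensate in the quotient.
            n, r = n + 1, r + 2
        digits.append(str(r))
    return "".join(reversed(digits))


def from_negabinary(s: str) -> int:
    """Inverse of to_negabinary: evaluate the digit string in base -2."""
    return sum(int(d) * (-2) ** i for i, d in enumerate(reversed(s)))


if __name__ == "__main__":
    for value in range(-4, 5):
        print(value, to_negabinary(value))
```

Unlike ordinary binary, base -2 represents both positive and negative integers without a sign bit, because the place values alternate in sign (1, -2, 4, -8, ...).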
