Compositional Distributed Learning for Multi-View Perception: A Maximal Coding Rate Reduction Perspective
Zhuojun Tian, Mehdi Bennis
TL;DR
The paper tackles distributed multi-view perception where each agent observes partial data and centralized data fusion is impractical. It proposes a compositional framework based on Maximal Coding Rate Reduction (MCR$^2$) to learn discriminative, diverse subspaces locally and then fuse them into a global representation via periodic SVD-based basis fusion and a projection loss that enforces subspace alignment. The authors provide theoretical guarantees: the projection-induced change in the MCR$^2$ objective is bounded by the projection residual energy, and the fused subspace converges to the true global discriminative subspace under mild assumptions, with an explicit rate depending on local estimation errors. Empirically, the approach yields competitive accuracy on CIFAR-10 and ModelNet-10 while preserving cross-view diversity and intra-class structure, outperforming baselines that produce correlated or collapsed representations.
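To make the objective concrete, the following is a minimal NumPy sketch of the coding rate reduction in its commonly used form, $\Delta R = R(Z) - \sum_j \frac{n_j}{n} R(Z_j)$ with $R(Z) = \frac{1}{2}\log\det\left(I + \frac{d}{n\epsilon^2} Z Z^\top\right)$. It assumes the paper uses this standard form; the function names `coding_rate` and `rate_reduction` and the default distortion `eps` are illustrative choices, not taken from the paper.

```python
import numpy as np

def coding_rate(Z, eps=0.5):
    """Coding rate R(Z) = 1/2 * logdet(I + d/(n*eps^2) * Z Z^T).

    Z has shape (d, n): n feature vectors of dimension d stored as columns.
    eps is the allowed distortion (an assumed default, not from the paper).
    """
    d, n = Z.shape
    M = np.eye(d) + (d / (n * eps**2)) * (Z @ Z.T)
    # slogdet is numerically safer than log(det(M)); M is positive definite.
    _, logdet = np.linalg.slogdet(M)
    return 0.5 * logdet

def rate_reduction(Z, labels, eps=0.5):
    """Rate reduction: rate of all features minus the
    class-weighted sum of the per-class rates."""
    n = Z.shape[1]
    r_total = coding_rate(Z, eps)
    r_classes = 0.0
    for c in np.unique(labels):
        Zc = Z[:, labels == c]  # features of class c
        r_classes += (Zc.shape[1] / n) * coding_rate(Zc, eps)
    return r_total - r_classes
```

Under this form, the value is close to zero when all classes occupy the same subspace with the same statistics, and it grows when the class subspaces become mutually orthogonal, which is the sense in which maximizing it favors discriminative, diverse representations.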
Abstract
In this letter, we formulate a compositional distributed learning framework for multi-view perception by combining the maximal coding rate reduction principle with subspace basis fusion. In the proposed algorithm, each agent periodically performs a singular value decomposition on its learned subspaces and exchanges truncated basis matrices, from which the fused subspaces are obtained. By introducing a projection matrix and minimizing the distance between the outputs and their projections, the learned representations are driven towards the fused subspaces. We prove that the trace term of the coding-rate change is bounded and that the basis fusion is consistent. Numerical simulations show that the proposed algorithm achieves high classification accuracy while maintaining the diversity of the representations, whereas the baselines exhibit correlated subspaces and coupled representations.
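The fusion and projection steps described above can be sketched for a single class as follows. This is only an illustration under stated assumptions: the abstract does not give the exact fusion rule, so the sketch assumes the exchanged truncated bases are stacked column-wise and the fused basis is taken as the top-$r$ left singular vectors of the stack. The names `truncated_basis`, `fuse_bases`, `projection_loss` and the rank `r` are hypothetical.

```python
import numpy as np

def truncated_basis(Z, r):
    """Top-r left singular vectors of local features Z with shape (d, n)."""
    U, _, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, :r]

def fuse_bases(bases, r):
    """Assumed fusion rule: stack the exchanged bases column-wise and
    keep the top-r left singular vectors of the stacked matrix."""
    stacked = np.hstack(bases)
    U, _, _ = np.linalg.svd(stacked, full_matrices=False)
    return U[:, :r]

def projection_loss(Z, U):
    """Mean squared distance between the outputs Z and their orthogonal
    projection onto span(U), using the projection matrix P = U U^T."""
    P = U @ U.T
    residual = Z - P @ Z
    return np.sum(residual**2) / Z.shape[1]

# Illustrative usage: three agents whose features lie in a shared
# 2-dimensional subspace of R^6 (synthetic data, not from the paper).
rng = np.random.default_rng(0)
B, _ = np.linalg.qr(rng.standard_normal((6, 2)))  # orthonormal basis
views = [B @ rng.standard_normal((2, 30)) for _ in range(3)]
local_bases = [truncated_basis(Z, r=2) for Z in views]
U_fused = fuse_bases(local_bases, r=2)
print(projection_loss(views[0], U_fused))
```

In a training loop, a term like `projection_loss` would be added to each agent's local objective so that its outputs are pulled towards the fused subspace between fusion rounds; only the $d \times r$ basis matrices are exchanged, not the raw features.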
