Robustness and Regularization in Hierarchical Re-Basin

Benedikt Franke; Florian Heinrich; Markus Lange; Arne Raulf

TL;DR

The paper addresses the challenge of merging many trained networks by introducing a hierarchical Git Re-Basin scheme that combines $2^n$ models over $n$ stages using pairwise teleports and interpolation with a coefficient of $0.5$. It shows that this hierarchical approach outperforms the existing MergeMany method and can induce adversarial and perturbation robustness that grows with the number of merged models, while also reducing weight norms and Lipschitz bounds, albeit at the cost of accuracy relative to the unmerged baseline. The work also reports results that are inconsistent with earlier claims of a zero accuracy barrier and raises reproducibility concerns, underscoring the need for more rigorous evaluation. Overall, the proposed method offers a scalable, robustness-aware fusion mechanism and contributes to the understanding of permutation-invariant model merging and linear mode connectivity.

Abstract

This paper takes a closer look at Git Re-Basin, an interesting new approach to merge trained models. We propose a hierarchical model merging scheme that significantly outperforms the standard MergeMany algorithm. With our new algorithm, we find that Re-Basin induces adversarial and perturbation robustness into the merged models, with the effect becoming stronger the more models participate in the hierarchical merging scheme. However, in our experiments Re-Basin induces a much bigger performance drop than reported by the original authors.
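To make the hierarchical scheme concrete, the following is a minimal sketch, not the authors' implementation. It assumes a hypothetical toy model layout (a one-hidden-layer MLP stored as a dict with keys `W1`, `b1`, `W2`) and replaces the Re-Basin alignment step with a brute-force search over hidden-unit permutations, which is only feasible for very small layers. All function names are illustrative assumptions. It merges $2^n$ models in $n$ stages by aligning each pair and interpolating with a coefficient of $0.5$.

```python
from itertools import permutations


def permute_hidden(model, perm):
    """Reorder hidden units; the function computed by the network is unchanged."""
    return {
        "W1": [model["W1"][i] for i in perm],
        "b1": [model["b1"][i] for i in perm],
        "W2": [[row[i] for i in perm] for row in model["W2"]],
    }


def unit_features(model, j):
    """All weights attached to hidden unit j: incoming, bias, outgoing."""
    return model["W1"][j] + [model["b1"][j]] + [row[j] for row in model["W2"]]


def align(reference, model):
    """Assumed stand-in for Re-Basin alignment: brute-force the hidden-unit
    permutation of `model` with the largest inner product to `reference`."""
    hidden = len(reference["b1"])
    ref = [unit_features(reference, j) for j in range(hidden)]
    cand = [unit_features(model, j) for j in range(hidden)]

    def score(perm):
        return sum(
            sum(r * c for r, c in zip(ref[j], cand[perm[j]]))
            for j in range(hidden)
        )

    best = max(permutations(range(hidden)), key=score)
    return permute_hidden(model, list(best))


def interpolate(a, b, alpha=0.5):
    """Element-wise linear interpolation (1 - alpha) * a + alpha * b."""
    def mix(x, y):
        return (1 - alpha) * x + alpha * y

    return {
        "W1": [[mix(x, y) for x, y in zip(ra, rb)]
               for ra, rb in zip(a["W1"], b["W1"])],
        "b1": [mix(x, y) for x, y in zip(a["b1"], b["b1"])],
        "W2": [[mix(x, y) for x, y in zip(ra, rb)]
               for ra, rb in zip(a["W2"], b["W2"])],
    }


def hierarchical_merge(models, alpha=0.5):
    """Merge 2**n models in n stages of pairwise align-then-interpolate."""
    count = len(models)
    if count == 0 or count & (count - 1):
        raise ValueError("expected 2**n models")
    stage = list(models)
    while len(stage) > 1:
        # Each stage halves the number of models.
        stage = [
            interpolate(stage[i], align(stage[i], stage[i + 1]), alpha)
            for i in range(0, len(stage), 2)
        ]
    return stage[0]
```

In practice the permutation search would be replaced by a scalable matching procedure applied layer by layer; the brute-force version here only illustrates the control flow of the stages.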

Paper Structure

Table of Contents

Figures (5)