Linear Mode Connectivity in Differentiable Tree Ensembles

Ryuichi Kanoh, Mahito Sugiyama

TL;DR

This paper studies Linear Mode Connectivity (LMC) in differentiable tree ensembles, extending the concept beyond neural networks to soft tree models. It formalizes a barrier measure $\mathcal{B}(\boldsymbol{\Theta}_A, \boldsymbol{\Theta}_B)$ and applies activation matching and weight matching to reduce the barrier between independently trained models. It identifies three invariances in tree ensembles (tree permutation, subtree flip, and splitting order) and shows that LMC is achievable either when all three are accounted for or when a decision-list architecture is used, which removes the subtree flip and splitting order invariances by design so that only tree permutation has to be matched. Experiments on tabular datasets and MNIST show substantial barrier reductions, and the paper discusses trade-offs between model merging efficiency and invariance-induced barriers.

Abstract

Linear Mode Connectivity (LMC) refers to the phenomenon that performance remains consistent for linearly interpolated models in the parameter space. For independently optimized model pairs from different random initializations, achieving LMC is considered crucial for understanding the stable success of the non-convex optimization in modern machine learning models and for facilitating practical parameter-based operations such as model merging. While LMC has been achieved for neural networks by considering the permutation invariance of neurons in each hidden layer, its attainment for other models remains an open question. In this paper, we first achieve LMC for soft tree ensembles, which are tree-based differentiable models extensively used in practice. We show the necessity of incorporating two invariances: subtree flip invariance and splitting order invariance, which do not exist in neural networks but are inherent to tree architectures, in addition to permutation invariance of trees. Moreover, we demonstrate that it is even possible to exclude such additional invariances while keeping LMC by designing decision list-based tree architectures, where such invariances do not exist by definition. Our findings indicate the significance of accounting for architecture-specific invariances in achieving LMC.
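To make the subtree flip invariance concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a depth-1 soft tree that routes an input to the left leaf with probability sigmoid(w·x + b); the paper's exact parameterization may differ, and the names `soft_stump`, `flip`, and `interpolate` are hypothetical. Negating the split parameters and swapping the two leaves leaves the function unchanged, yet naively averaging the two parameter sets collapses the model to a constant. Undoing the flip before interpolating removes that error.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def soft_stump(x, w, b, leaf_left, leaf_right):
    """Depth-1 soft tree (assumed form): go left with probability sigmoid(x.w + b)."""
    p_left = sigmoid(x @ w + b)
    return p_left * leaf_left + (1.0 - p_left) * leaf_right

def flip(params):
    """Subtree flip: negate the split and swap the leaves."""
    w, b, leaf_left, leaf_right = params
    return (-w, -b, leaf_right, leaf_left)

def interpolate(params_a, params_b, lam):
    """Linear interpolation in parameter space."""
    return tuple((1.0 - lam) * a + lam * b for a, b in zip(params_a, params_b))

rng = np.random.default_rng(0)
x = rng.normal(size=(256, 3))

# Illustrative parameters: split weights, split bias, left leaf, right leaf.
theta = (np.array([1.5, -2.0, 0.5]), 0.3, 1.0, -1.0)
theta_flipped = flip(theta)

# The flipped model computes the same function, since sigmoid(-z) = 1 - sigmoid(z).
out = soft_stump(x, *theta)
same_function = np.allclose(out, soft_stump(x, *theta_flipped))

# Naive midpoint: split weights and leaves cancel, so the output is constant.
naive_mid = interpolate(theta, theta_flipped, 0.5)
naive_error = float(np.mean((soft_stump(x, *naive_mid) - out) ** 2))

# Aligned midpoint: undo the flip first, then interpolate.
aligned_mid = interpolate(theta, flip(theta_flipped), 0.5)
aligned_error = float(np.mean((soft_stump(x, *aligned_mid) - out) ** 2))

print(same_function, naive_error, aligned_error)
```

In this toy case the two endpoints are functionally identical, so any error at the naive midpoint comes purely from the unaccounted invariance. That is the kind of barrier the matching procedures described in the abstract aim to remove.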

Paper Structure

This paper contains 15 sections, 4 equations, 43 figures, 6 tables, and 2 algorithms.

Figures (43)

  • Figure 1: A representative experimental result on the MiniBooNE dataset (left) and conceptual diagram of LMC for tree ensembles (right).
  • Figure 2: A soft decision tree with a single inner node and two leaf nodes.
  • Figure 3: Invariances inherent to tree ensembles.
  • Figure 4: Weighting strategy.
  • Figure 5: Tree architecture where neither subtree flip invariance nor splitting order invariance exists.
  • ...and 38 more figures