A New Branch-and-Bound Pruning Framework for $\ell_0$-Regularized Problems

Theo Guyard, Cédric Herzet, Clément Elvira, Ayşe-Nur Arslan

TL;DR

The paper addresses exact sparse learning via Branch-and-Bound (BnB) for problems of the form $p^\star = \inf_{\mathbf{x}} f(\mathbf{A}\mathbf{x}) + g(\mathbf{x})$ with $g(\mathbf{x}) = \lambda\|\mathbf{x}\|_0 + \sum_i h(x_i)$, where conventional pruning tests require solving a costly convex relaxation at each node. It introduces a duality-based pruning strategy that computes a valid lower bound $\tilde{p}^\nu$ for a node $\nu$ without solving a relaxation and that tests several regions simultaneously, reducing overhead. The method uses Fenchel-Rockafellar duality to form a dual function $D^\nu(\mathbf{u})$ and shows how to evaluate the dual bounds of all direct successors of a node at $\mathcal{O}(mn)$ cost, with a principled way to propagate pruning across the tree. Numerical experiments on synthetic and real-world datasets show speedups of several orders of magnitude over standard solvers, expanding the set of exact sparse learning and feature selection problems that are tractable.
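To make the idea of a relaxation-free bound concrete, here is a minimal Python sketch. It is not the authors' implementation. It assumes a least-squares loss $f(\mathbf{w}) = \tfrac{1}{2}\|\mathbf{y}-\mathbf{w}\|_2^2$ and a big-M penalty ($h(x) = 0$ if $|x| \le M$, $+\infty$ otherwise), neither of which is fixed by the summary above. The function names (`dual_bound`, `successor_bounds`, `primal_value`) and the sets `S1` and `Sfree` (indices forced to be nonzero and indices still undecided) are hypothetical. The formulas come from standard conjugacy computations under these assumptions: $-f^\star(-\mathbf{u}) = \langle \mathbf{u}, \mathbf{y}\rangle - \tfrac{1}{2}\|\mathbf{u}\|_2^2$ and $h^\star(v) = M|v|$. By weak duality, the value returned for any $\mathbf{u}$ is a valid lower bound on the node's optimal value.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(p * q for p, q in zip(a, b))


def dual_bound(A_cols, y, u, lam, M, S1, Sfree):
    """Dual value at u for a node, plus the per-index 'pivots'.

    Assumptions (illustrative, not taken from the paper):
    f(w) = 0.5 * ||y - w||^2 and h = big-M indicator, so h*(v) = M * |v|.
    A_cols is the list of columns of A. Indices absent from S1 and Sfree
    are treated as fixed to zero and contribute nothing.
    """
    # -f*(-u) for the least-squares loss
    value = dot(u, y) - 0.5 * dot(u, u)
    # pivot_i = h*(a_i^T u) - lam; computing all of them costs O(mn)
    pivots = [M * abs(dot(a, u)) - lam for a in A_cols]
    # indices forced to be nonzero contribute the full pivot
    value -= sum(pivots[i] for i in S1)
    # undecided indices contribute only the positive part of the pivot
    value -= sum(max(pivots[i], 0.0) for i in Sfree)
    return value, pivots


def successor_bounds(value, pivots, Sfree):
    """Dual bounds of all direct successors, reusing the same pivots."""
    # fixing x_i = 0 removes the term max(pivot_i, 0)
    to_zero = {i: value + max(pivots[i], 0.0) for i in Sfree}
    # forcing x_i != 0 replaces max(pivot_i, 0) by pivot_i
    to_one = {i: value + max(-pivots[i], 0.0) for i in Sfree}
    return to_zero, to_one


def primal_value(A_cols, y, x, lam):
    """Objective 0.5 * ||y - Ax||^2 + lam * ||x||_0 (x assumed within the big-M box)."""
    Ax = [sum(col[k] * xi for col, xi in zip(A_cols, x)) for k in range(len(y))]
    nnz = sum(1 for xi in x if xi != 0.0)
    return 0.5 * sum((yk - wk) ** 2 for yk, wk in zip(y, Ax)) + lam * nnz


# Toy usage at the root node, where every index is still undecided.
A_cols = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [1.0, 2.0]
u = [0.5, 1.0]  # any dual point gives a valid bound
root_value, root_pivots = dual_bound(A_cols, y, u, 0.5, 0.5, [], [0, 1, 2])
zero_bounds, one_bounds = successor_bounds(root_value, root_pivots, [0, 1, 2])
```

In this sketch, a successor whose bound exceeds the best objective value found so far could be discarded without being explored. Once the products $\mathbf{a}_i^\top \mathbf{u}$ are available, every successor bound follows from a single addition, which is the kind of saving the $\mathcal{O}(mn)$ figure refers to.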

Abstract

We consider the resolution of learning problems involving $\ell_0$-regularization via Branch-and-Bound (BnB) algorithms. These methods explore regions of the feasible space of the problem and use "pruning tests" to check whether a region contains no solution. In standard implementations, evaluating a pruning test requires solving a convex optimization problem, which may result in computational bottlenecks. In this paper, we present an alternative way to implement pruning tests for a generic family of $\ell_0$-regularized problems. Our proposed procedure allows the simultaneous assessment of several regions and can be embedded in standard BnB implementations with negligible computational overhead. We show through numerical simulations that our pruning strategy can improve the solving time of BnB procedures by several orders of magnitude for typical problems encountered in machine-learning applications.

Paper Structure

This paper contains 38 sections, 6 theorems, 78 equations, 5 figures, 2 tables, 1 algorithm.

Key Result

Proposition 1

Let $\nu=(\mathcal{S}_{0},\mathcal{S}_{1},\mathcal{S}_{\bullet})$ be a node. Under hypotheses hyp:pertfunc to hyp:zero-minimized, the function $(g^\nu)^{\star}(\cdot)$ is separable and defined coordinate-wise for all $v \in \mathbb{R}$ as

Figures (5)

  • Figure 1: Illustration of the decision-tree exploration. We note that a relaxation has to be solved at each node of the tree to evaluate the lower bound \ref{eq:std-lb} involved in pruning test \ref{eq:pruning-test}. Here, the pruning test is passed for nodes $\nu_2$ and $\nu_4$.
  • Figure 2: Impact of simultaneous pruning tests on the tree exploration. Output of \ref{algo:expansion-tree} when applied with $\nu=\nu_0$, $\mathcal{I}_0^{\nu_0}=\emptyset$ and $\mathcal{I}_1^{\nu_0}=\{i_0,i_1\}$.
  • Figure 3: Performance profiles of different solvers.
  • Figure 4: Acceleration factor when implementing the simultaneous pruning tests in addition to the standard pruning strategy during the algorithm.
  • Figure 5: Time to construct solutions with a given sparsity level. The black dotted line represents the maximum time of one hour allowed.

Theorems & Definitions (11)

  • Definition 1
  • Proposition 1
  • Proposition 2
  • Proposition 3
  • Lemma 1
  • proof
  • Lemma 2
  • proof
  • Lemma 3
  • proof: Proof of \ref{prop:dual-link}
  • ...and 1 more