A scalable mixed-integer conic optimization approach to cardinality-constrained Poisson regression with safe screening

Kota Kurihara, Yoichi Izunaga

TL;DR

The paper tackles high-dimensional cardinality-constrained Poisson regression by formulating it as a mixed-integer conic optimization problem and introducing a Fenchel-conjugate-based safe screening to prune irrelevant features. It leverages exponential-cone representations and perspective relaxation to obtain stronger continuous relaxations, and develops a dual-based greedy upper-bound strategy to drive safe screening and provide good initial solutions. Computational experiments on synthetic data demonstrate substantial problem-size reductions and fast convergence, enabling solutions with tens of thousands of features and up to 2,000 observations while maintaining small optimality gaps. The work offers a scalable, interpretable feature-selection tool for high-dimensional Poisson regression, with potential extensions to real data and to broader count models such as negative binomial regression.

Abstract

This paper introduces a novel approach for cardinality-constrained Poisson regression to address feature selection challenges in high-dimensional count data. We formulate the problem as a mixed-integer conic optimization, enabling the use of modern solvers for optimal solutions. To enhance computational efficiency, we develop a safe screening based on Fenchel conjugates, thereby effectively removing irrelevant features before optimization. Experiments on synthetic datasets demonstrate that our safe screening significantly reduces the problem size, leading to substantial improvements in computational time. Our approach can solve Poisson regression problems with tens of thousands of features, exceeding the scale of previous studies. This work provides a valuable tool for interpretable feature selection in high-dimensional Poisson regression.
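
To make the phrase "mixed-integer conic optimization" concrete, the following is a generic textbook-style sketch of how a cardinality-constrained Poisson regression can be written with exponential cones. It is not taken from the paper: the symbols $\boldsymbol{x}_i$, $y_i$, $\boldsymbol{\beta}$, $p$, $k$, $t_i$, $z_j$, and $M$ are assumptions introduced here for illustration, and the big-$M$ link between $\beta_j$ and $z_j$ is a simple stand-in for the perspective-based formulation the summary mentions. With $m$ observations $(\boldsymbol{x}_i, y_i)$, $p$ features, and at most $k$ nonzero coefficients, the negative log-likelihood (dropping the constant $\log y_i!$) gives

$$
\min_{\boldsymbol{\beta}\in\mathbb{R}^{p}} \; \sum_{i=1}^{m} \left( \exp(\boldsymbol{x}_i^{\top}\boldsymbol{\beta}) - y_i\, \boldsymbol{x}_i^{\top}\boldsymbol{\beta} \right)
\quad \text{s.t.} \quad \|\boldsymbol{\beta}\|_0 \le k .
$$

Introducing epigraph variables $t_i \ge \exp(\boldsymbol{x}_i^{\top}\boldsymbol{\beta})$ and binary indicators $z_j$, one equivalent conic form (assuming a valid bound $M$ on $|\beta_j|$) is

$$
\begin{aligned}
\min_{\boldsymbol{\beta},\,\boldsymbol{t},\,\boldsymbol{z}} \quad & \sum_{i=1}^{m} \left( t_i - y_i\, \boldsymbol{x}_i^{\top}\boldsymbol{\beta} \right) \\
\text{s.t.} \quad & (t_i,\; 1,\; \boldsymbol{x}_i^{\top}\boldsymbol{\beta}) \in K_{\exp}, && i = 1,\dots,m, \\
& -M z_j \le \beta_j \le M z_j, && j = 1,\dots,p, \\
& \sum_{j=1}^{p} z_j \le k, \qquad \boldsymbol{z}\in\{0,1\}^{p},
\end{aligned}
$$

where $K_{\exp} = \operatorname{cl}\{(u,v,w) : u \ge v\exp(w/v),\ v > 0\}$ is the exponential cone. In this sketch, a safe screening rule would amount to proving $z_j = 0$ or $z_j = 1$ for some indices $j$ before the solver runs, which shrinks the binary search space.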

Paper Structure

This paper contains 17 sections, 1 theorem, 36 equations, 2 figures, 6 tables, 1 algorithm.

Key Result

Theorem 5

Let $\boldsymbol{\lambda}\in \mathbb{R}^{m}$ be a dual variable, and $ub$ be an upper bound on the problem (P). Then, at any optimal solution of (P), it holds that

Figures (2)

  • Figure 1: Relationship between the problems.
  • Figure 2: Distribution of the number of fixed variables.

Theorems & Definitions (5)

  • Definition 1
  • Definition 2
  • Definition 3
  • Definition 4
  • Theorem 5