(Near)-Optimal Algorithms for Sparse Separable Convex Integer Programs

Christoph Hunkenschröder, Martin Koutecký, Asaf Levin, Tung Anh Vu

TL;DR

The paper studies the problem of minimizing a separable convex function over the integer points of a polytope defined by $A\mathbf{x}=\mathbf{b}$ and $\mathbf{l} \leq \mathbf{x} \leq \mathbf{u}$, focusing on block-structured matrices with small coefficients $\|A\|_\infty$ and small primal or dual treedepth. It gives two near-linear-time algorithms. The primal algorithm runs in time $g(\mathrm{td}_P(A), \|A\|_\infty) \; n \log\max(\|\mathbf{u}-\mathbf{l}\|_\infty, \|\mathbf{b}\|_\infty)$, which matches the information-theoretic lower bound up to the dependence on the parameters. The dual algorithm runs in time $g(\mathrm{td}_D(A), \|A\|_\infty) \; n \log n \log\max(\|\mathbf{u}-\mathbf{l}\|_\infty, \|\mathbf{b}\|_\infty)$, which the authors conjecture to be optimal. The approach combines scaling, proximity, and sensitivity analysis with a new convolution-tree dynamic data structure that supports sparse, fast updates, and it applies to $n$-fold, 2-stage, multi-stage, and tree-fold matrices. The work closes a gap between results for linear objectives and general separable convex objectives, yielding efficient algorithms for a broad class of block-structured IPs.

Abstract

We study the general integer programming (IP) problem of optimizing a separable convex function over the integer points of a polytope: $\min \{f(\mathbf{x}) \mid A\mathbf{x} = \mathbf{b}, \, \mathbf{l} \leq \mathbf{x} \leq \mathbf{u}, \, \mathbf{x} \in \mathbb{Z}^n\}$. The number of variables $n$ is a variable part of the input, and we consider the regime where the constraint matrix $A$ has small coefficients $\|A\|_\infty$ and small primal or dual treedepth $\mathrm{td}_P(A)$ or $\mathrm{td}_D(A)$, respectively. Equivalently, we consider block-structured matrices, in particular $n$-fold, tree-fold, $2$-stage and multi-stage matrices. We ask about the possibility of near-linear time algorithms in the general case of (non-linear) separable convex functions. The techniques of previous works for the linear case are inherently limited to it; in fact, no strongly-polynomial algorithm may exist due to a simple unconditional information-theoretic lower bound of $n \log \|\mathbf{u}-\mathbf{l}\|_\infty$, where $\mathbf{l}, \mathbf{u}$ are the vectors of lower and upper bounds. Our first result is that with parameters $\mathrm{td}_P(A)$ and $\|A\|_\infty$, this lower bound can be matched (up to dependency on the parameters). Second, with parameters $\mathrm{td}_D(A)$ and $\|A\|_\infty$, the situation is more involved, and we design an algorithm with time complexity $g(\mathrm{td}_D(A), \|A\|_\infty) n \log n \log \|\mathbf{u}-\mathbf{l}\|_\infty$ where $g$ is some computable function. We conjecture that a stronger lower bound is possible in this regime, and our algorithm is in fact optimal. Our algorithms combine ideas from scaling, proximity, and sensitivity of integer programs, together with a new dynamic data structure.
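To make the problem statement concrete, the following is a minimal brute-force sketch of the IP $\min \{f(\mathbf{x}) \mid A\mathbf{x} = \mathbf{b}, \, \mathbf{l} \leq \mathbf{x} \leq \mathbf{u}, \, \mathbf{x} \in \mathbb{Z}^n\}$ on a tiny instance. It is not the paper's algorithm: it enumerates every integer point in the box, so its running time is exponential in $n$. The function name `solve_tiny_ip` and the example data are assumptions made for illustration only.

```python
from itertools import product

def solve_tiny_ip(A, b, l, u, fs):
    """Brute-force min sum_i fs[i](x_i) subject to A x = b, l <= x <= u, x integer.

    Illustrative only: enumerates the whole box, so it is exponential in n.
    Returns (best_value, best_point), or (None, None) if the IP is infeasible.
    """
    best_val, best_x = None, None
    for x in product(*(range(lo, hi + 1) for lo, hi in zip(l, u))):
        # Keep x only if it satisfies every equality constraint.
        feasible = all(
            sum(a * xi for a, xi in zip(row, x)) == bi
            for row, bi in zip(A, b)
        )
        if not feasible:
            continue
        # Separable objective: one univariate convex function per coordinate.
        val = sum(f(xi) for f, xi in zip(fs, x))
        if best_val is None or val < best_val:
            best_val, best_x = val, x
    return best_val, best_x

# Hypothetical instance: one constraint x1 + x2 + x3 = 4 with 0 <= x_i <= 3.
A = [[1, 1, 1]]
b = [4]
l = [0, 0, 0]
u = [3, 3, 3]
fs = [
    lambda t: (t - 2) ** 2,
    lambda t: 2 * (t - 1) ** 2,
    lambda t: 3 * t * t,
]
value, x = solve_tiny_ip(A, b, l, u, fs)
print(value, x)
```

On this instance the unconstrained minimizer $(2, 1, 0)$ violates the constraint, and the cheapest feasible repair raises $x_1$ by one.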

Paper Structure

This paper contains 20 sections, 17 theorems, 27 equations, 2 figures.

Key Result

Theorem

There is a computable function $g$ and an algorithm which solves IP in time $g(\mathrm{td}_P(A), \|A\|_\infty) \, n \log(\max(\|\mathbf{u} - \mathbf{l}\|_\infty, \|\mathbf{b}\|_\infty))$.
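As a hedged illustration of where a factor of the form $n \log \|\mathbf{u}-\mathbf{l}\|_\infty$ comes from, consider the special case with no equality constraints: each coordinate can then be minimized independently by binary search on its univariate convex function. The sketch below is not the paper's algorithm and does not reproduce its lower-bound argument; the function name `argmin_convex_int` and the evaluation counter are assumptions for illustration.

```python
def argmin_convex_int(f, lo, hi):
    """Minimize a convex function f over the integers lo..hi by binary search.

    Returns (x, evals), where x is a minimizer and evals counts calls to f.
    Relies on convexity: if f(mid) <= f(mid + 1), some minimizer is <= mid;
    otherwise every minimizer is >= mid + 1.
    """
    evals = 0
    while lo < hi:
        mid = (lo + hi) // 2
        evals += 2  # one comparison costs two function evaluations
        if f(mid) <= f(mid + 1):
            hi = mid
        else:
            lo = mid + 1
    return lo, evals

# Hypothetical univariate objective on the range 0..2**20.
x, evals = argmin_convex_int(lambda t: (t - 123456) ** 2, 0, 2 ** 20)
print(x, evals)
```

The loop halves the interval in every iteration, so one coordinate costs a number of evaluations logarithmic in $u_i - l_i$, and $n$ independent coordinates cost $n$ times that.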

Figures (2)

  • Figure 1: On the left a schematic depiction of a multi-stage stochastic matrix with three levels. On the right a schematic tree-fold matrix with 4 layers. All entries outside of the rectangles must be zero. Entries within rectangles can be non-zero.
  • Figure 2: The situation of Lemma \ref{lem:prox-optima}: the feasible region between lower and upper bounds is the light grey rectangle; the dark grey rectangle marks the region of vectors which, when translated to $\hat{\mathbf{z}}$, are conformal to $\hat{\mathbf{x}} - \hat{\mathbf{z}}$. The picture makes it clear that $\mathbf{g}$ and $\bar{\mathbf{g}}$ are conformal to $\hat{\mathbf{x}} - \hat{\mathbf{z}}$, and that the identity $\hat{\mathbf{x}} = \hat{\mathbf{z}} + \mathbf{g} + \bar{\mathbf{g}}$ holds.

Theorems & Definitions (38)

  • Theorem
  • Theorem
  • Lemma: Folklore, see [monster]
  • Proposition: Separable convex superadditivity [JesusBook]
  • Proposition
  • Proof
  • Definition: Treedepth
  • Definition: Topological height
  • Definition: Dual Block-structured Matrix [MOR]
  • Definition: Graver basis [Graver:1975]
  • ...and 28 more
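
The treedepth parameters can be made concrete on small matrices. The sketch below assumes the standard definitions: the primal graph of $A$ has one vertex per column and an edge between two columns that share a non-zero entry in some row, $\mathrm{td}_P(A)$ is the treedepth of that graph, and $\mathrm{td}_D(A) = \mathrm{td}_P(A^\top)$. Treedepth is computed by the textbook recursion (a single vertex has treedepth 1; a disconnected graph takes the maximum over its components; a connected graph takes 1 plus the best vertex deletion). This brute force takes exponential time and is for illustration only; all function names are hypothetical.

```python
from functools import lru_cache

def primal_graph(A):
    """Adjacency sets of the primal graph: columns sharing a non-zero row are adjacent."""
    adj = {j: set() for j in range(len(A[0]))}
    for row in A:
        support = [j for j, a in enumerate(row) if a != 0]
        for i in support:
            for j in support:
                if i != j:
                    adj[i].add(j)
    return adj

def components(adj, vertices):
    """Connected components of the subgraph induced on `vertices`."""
    remaining = set(vertices)
    comps = []
    while remaining:
        start = remaining.pop()
        comp, stack = {start}, [start]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if w in remaining:
                    remaining.remove(w)
                    comp.add(w)
                    stack.append(w)
        comps.append(frozenset(comp))
    return comps

def treedepth(adj):
    """Exact treedepth of a non-empty graph by exhaustive recursion (exponential time)."""
    @lru_cache(maxsize=None)
    def td(vs):
        if len(vs) == 1:
            return 1
        comps = components(adj, vs)
        if len(comps) > 1:
            return max(td(c) for c in comps)
        return 1 + min(td(vs - {v}) for v in vs)
    return td(frozenset(adj))

# Hypothetical matrix whose primal graph is a path on 4 vertices.
A = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 1],
]
A_T = [list(col) for col in zip(*A)]
td_primal = treedepth(primal_graph(A))
td_dual = treedepth(primal_graph(A_T))
print(td_primal, td_dual)
```

For this matrix the primal graph is a path on four vertices and the dual graph is a path on three, so the two parameters differ even on a small example.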