A DC-Reformulation for Gradient-$L^0$-Constrained Problems

Bastian Dittrich, Evelyn Herberg, Roland Herzog, Georg Müller

TL;DR

This work extends the DC-reformulation approach to problems with $L^0$-type cardinality constraints on the support of the gradients, i.e., problems where sparsity of the gradient and thus piecewise constant solutions are the target.

Abstract

Cardinality constraints in optimization are commonly of $L^0$-type, and they lead to sparsely supported optimizers. An efficient way of dealing with these constraints algorithmically, when the objective functional is convex, is reformulating the constraint using the difference of suitable $L^1$- and largest-$K$-norms and subsequently solving a sequence of penalized subproblems in the difference-of-convex (DC) class. We extend this DC-reformulation approach to problems with $L^0$-type cardinality constraints on the support of the gradients, i.e., problems where sparsity of the gradient and thus piecewise constant solutions are the target.
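
The reformulation described in the abstract rests on a standard finite-dimensional fact: a vector $v$ has at most $K$ nonzero entries if and only if $\|v\|_1 - \|v\|_{(K)} = 0$, where $\|v\|_{(K)}$ denotes the sum of the $K$ largest absolute values. The following minimal sketch illustrates this for the gradient of a piecewise constant signal. It assumes a 1D signal, forward differences as the discrete gradient, and an integer $K$ counting jumps; the paper itself works with finite element functions and an area-type measure, so this is a simplified illustration rather than the authors' discretization. The function names are hypothetical.

```python
import numpy as np

def largest_k_norm(v, k):
    """Sum of the k largest absolute values of v (largest-K norm)."""
    a = np.sort(np.abs(v))[::-1]  # absolute values in descending order
    return float(a[:k].sum())

def dc_gap(v, k):
    """Difference ||v||_1 - ||v||_(k).

    The value is nonnegative and vanishes exactly when v has at most
    k nonzero entries, so it can serve as a penalty term for the
    cardinality constraint.
    """
    return float(np.abs(v).sum()) - largest_k_norm(v, k)

# Piecewise constant signal with two jumps (of heights 1 and 2).
u = np.concatenate([np.zeros(4), np.ones(3), 3.0 * np.ones(5)])

# Assumption: forward differences stand in for the discrete gradient.
g = np.diff(u)

gap_k2 = dc_gap(g, 2)  # constraint "at most 2 jumps" holds
gap_k1 = dc_gap(g, 1)  # constraint "at most 1 jump" is violated

print(gap_k2, gap_k1)
```

In a penalized subproblem, a term such as `rho * dc_gap(np.diff(u), K)` would be added to the convex objective. Both parts of the difference are convex in `u`, which is what places the subproblem in the DC class.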

Paper Structure

This paper contains 12 sections, 23 theorems, 110 equations, 5 figures, 3 tables, 1 algorithm.

Key Result

lemma 1

Both $\lVert \lvert \nabla (\cdot) \rvert_2^2 \rVert_1$ and $\lVert \lvert \nabla (\cdot) \rvert_2^2 \rVert_K$ are weakly sequentially continuous on $U$, and therefore so is $\phi$.

Figures (5)

  • Figure 4.1: Consider a feasible function $u_h \in U_h$ on the triangulated unit square that is constant exactly on the simplices marked in red in panel (a), with the corresponding index set $\mathcal{I}_{\textup{c}}(\mathbf{u})$. The quantity $\lVert \mathbf{w}(\mathbf{u}) \rVert_{0,\nu}$ is the area of the complement (white cells). Panel (b) shows a connected set of four vertices $V^{(1)}$ (blue circles). The simplices corresponding to $\sigma_{S \leftarrow V}(V^{(1)})$ are overlaid in blue. The set $V^{(1)}$ covers one cell, $\mathcal{I}_{\textup{cov}}(V^{(1)})$, hatched in blue. The cells in the intersection $\mathcal{I}_{\textup{c}}(\mathbf{u}) \cap \sigma_{S \leftarrow V}(V^{(1)}) \setminus \mathcal{I}_{\textup{cov}}(V^{(1)})$ are the ones in purple, without the hatched cell. The set $V^{(1)}$ belongs to $\mathcal{V}(\mathbf{u})$ if and only if the combined area of this intersection (purple, non-hatched) and the previously white cells does not exceed $K$. Note that $V^{(1)}$ is not maximally connected, in contrast to the proof of theorem:B-stationarity:explicit-description. Panel (c) shows a vertex set $V^{(2)}$ consisting of a single vertex (blue circle). The simplices corresponding to $\sigma_{S \leftarrow V}(V^{(2)})$ are blue. No cells are covered by $V^{(2)}$, hence $\mathcal{I}_{\textup{cov}}(V^{(2)}) = \emptyset$. Because the colored regions do not overlap, there are no purple, non-hatched cells, i.e., we have $\mathcal{I}_{\textup{c}}(\mathbf{u}) \cap \sigma_{S \leftarrow V}(V^{(2)}) \setminus \mathcal{I}_{\textup{cov}}(V^{(2)}) = \emptyset$, so $V^{(2)} \in \mathcal{V}(\mathbf{u})$. Panel (d) shows a vertex set $V^{(3)}$ (blue circles). The simplices corresponding to $\sigma_{S \leftarrow V}(V^{(3)})$ are overlaid in blue. All cells corresponding to $\mathcal{I}_{\textup{c}}(\mathbf{u})$ are in $\mathcal{I}_{\textup{cov}}(V^{(3)})$ (hatched). Again, there are no purple, non-hatched cells, thus $V^{(3)} \in \mathcal{V}(\mathbf{u})$.
  • Figure 5.1: Computed solution and $\lambda$, cf. eq:general-problem:discrete:penalized:multiplier, for $1/h = 512$, without using a $K$-schedule (top row), and with a $K$-schedule (bottom row). Triangles where the solution is constant in the sense of our threshold are indicated by a darker shade.
  • Figure 5.2: Computed solutions for $K = 1, 0.5, 0.05, 0.01$ (top left to bottom right). Triangles where the solution is constant in the sense of our threshold are indicated by a darker shade. Note that the plot range in the top row is six times as large as in the bottom row.
  • Figure 5.3: Value of $\psi(\mathbf{u}^{(k)})$ over the iterations for different values of $\rho$. After the stopping criterion eq:stopping-criterion:alternative is satisfied, we let the algorithm execute another $100$ iterations.
  • Figure 5.4: Computed solutions for $\rho = 10^4, 10^5, 10^6, 10^7$ (top left to bottom right). Triangles where the solution is constant in the sense of our threshold are indicated by a darker shade.

Theorems & Definitions (50)

  • lemma 1
  • proof
  • theorem 1
  • proof
  • theorem 2: Convergence of global minimizers
  • proof
  • definition 1: Critical point, strongly critical point
  • theorem 3: Characterization of the subdifferential $\partial \lVert W(\cdot) \rVert_{K}$
  • proof
  • corollary 1
  • ...and 40 more