Deep Loss Convexification for Learning Iterative Models

Ziming Zhang, Yuping Shao, Yiqing Zhang, Fangzhou Lin, Haichong Zhang, Elke Rundensteiner

TL;DR

This paper proposes star-convexity, a family of structured nonconvex functions that are unimodal on all lines passing through a global minimizer, as a geometric constraint for reshaping loss landscapes. The constraint yields additional hinge losses that are appended to the original loss, and it leads to near-optimal predictions.

Abstract

Iterative methods such as iterative closest point (ICP) for point cloud registration often suffer from bad local optimality (e.g., saddle points), due to the nature of nonconvex optimization. To address this fundamental challenge, in this paper we propose learning to form the loss landscape of a deep iterative method w.r.t. predictions at test time into a convex-like shape locally around each ground truth given data, namely Deep Loss Convexification (DLC), thanks to the overparametrization in neural networks. To this end, we formulate our learning objective based on adversarial training by manipulating the ground-truth predictions, rather than input data. In particular, we propose using star-convexity, a family of structured nonconvex functions that are unimodal on all lines that pass through a global minimizer, as our geometric constraint for reshaping loss landscapes, leading to (1) extra novel hinge losses appended to the original loss and (2) near-optimal predictions. We demonstrate state-of-the-art performance using DLC with existing network architectures for the tasks of training recurrent neural networks (RNNs), 3D point cloud registration, and multimodal image alignment.
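
To make the idea of a star-convexity hinge loss concrete, here is a minimal sketch; it is not the paper's actual objective. It assumes the plain star-convexity inequality $f(\tilde{\omega}) \le (1-\lambda) f(\omega^*) + \lambda f(\omega)$ with $\tilde{\omega} = (1-\lambda)\omega^* + \lambda\omega$, and turns any violation of it into a hinge penalty. The names `star_convex_hinge`, `loss_fn`, `quadratic`, and `bumpy`, as well as the fixed grid of `lams`, are hypothetical choices made for illustration.

```python
import numpy as np

def star_convex_hinge(loss_fn, w_star, w, lams):
    """Mean hinge penalty for star-convexity violations on the segment w_star -> w.

    loss_fn : callable mapping a prediction vector to a scalar loss (hypothetical)
    w_star  : ground-truth prediction, assumed to be a global minimizer of loss_fn
    w       : perturbed prediction
    lams    : interpolation weights in [0, 1]
    """
    w_star = np.asarray(w_star, dtype=float)
    w = np.asarray(w, dtype=float)
    f_star, f_w = loss_fn(w_star), loss_fn(w)
    penalties = []
    for lam in lams:
        w_tilde = (1.0 - lam) * w_star + lam * w      # point on the segment
        chord = (1.0 - lam) * f_star + lam * f_w      # value allowed by star-convexity
        penalties.append(max(0.0, loss_fn(w_tilde) - chord))
    return float(np.mean(penalties))

def quadratic(w):
    # Convex toy loss with its minimum at the origin.
    return float(np.sum(np.asarray(w) ** 2))

def bumpy(w):
    # Toy loss with the same minimizer but ripples that break star-convexity.
    r = float(np.linalg.norm(w))
    return r ** 2 + 2.0 * np.sin(3.0 * r) ** 2

w_star = np.zeros(2)
w = np.array([np.pi / 3.0, 0.0])
lams = np.linspace(0.0, 1.0, 11)

penalty_quadratic = star_convex_hinge(quadratic, w_star, w, lams)
penalty_bumpy = star_convex_hinge(bumpy, w_star, w, lams)
print(penalty_quadratic, penalty_bumpy)
```

In this sketch the penalty vanishes for the convex toy loss and is positive for the rippled one. In a training setting, a term of this kind would be added to the original loss so that gradient updates push the landscape around the ground truth toward a star-convex shape.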

Paper Structure

This paper contains 17 sections, 3 theorems, 14 equations, 10 figures, 5 tables, 1 algorithm.

Key Result

Lemma 1

The following conditions hold iff a function $f:\mathbb{R}^n\rightarrow\mathbb{R}$ is $\mu$-strongly star-convex, given a global minimum $\omega^*\in\mathbb{R}^n$ and $\forall \lambda\in[0,1], \forall \omega\in\mathbb{R}^n$: where $\tilde{\omega} = (1-\lambda) \omega^* + \lambda \omega$.
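
As a numerical illustration of strong star-convexity, the sketch below tests the commonly used interpolation inequality $f(\tilde{\omega}) \le (1-\lambda) f(\omega^*) + \lambda f(\omega) - \frac{\mu}{2}\lambda(1-\lambda)\|\omega-\omega^*\|^2$ with $\tilde{\omega} = (1-\lambda)\omega^* + \lambda\omega$ on random samples. This inequality is taken as an assumption from the general literature on star-convex functions and is not claimed to be the exact set of conditions stated in Lemma 1; setting `mu=0` recovers plain star-convexity. All function names and test functions are hypothetical.

```python
import numpy as np

def strong_star_convexity_gap(f, w_star, w, lam, mu=0.0):
    """Return lhs - rhs of the assumed inequality; a value <= 0 means it holds."""
    w_star = np.asarray(w_star, dtype=float)
    w = np.asarray(w, dtype=float)
    w_tilde = (1.0 - lam) * w_star + lam * w
    rhs = ((1.0 - lam) * f(w_star) + lam * f(w)
           - 0.5 * mu * lam * (1.0 - lam) * np.sum((w - w_star) ** 2))
    return float(f(w_tilde) - rhs)

def max_gap(f, w_star, mu, n_samples=2000, seed=0):
    """Largest gap over random (w, lam) pairs; a sampled check, not a proof."""
    rng = np.random.default_rng(seed)
    worst = -np.inf
    for _ in range(n_samples):
        w = w_star + rng.normal(scale=3.0, size=w_star.shape)
        lam = rng.uniform(0.0, 1.0)
        worst = max(worst, strong_star_convexity_gap(f, w_star, w, lam, mu))
    return worst

w_star = np.array([1.0, -2.0])

def radial(w):
    # r * (1 - exp(-r)): star-convex around w_star, but not convex for r > 2.
    r = np.linalg.norm(np.asarray(w) - w_star)
    return float(r * (1.0 - np.exp(-r)))

def quadratic(w):
    # ||w - w_star||^2 is 2-strongly convex, hence 2-strongly star-convex.
    return float(np.sum((np.asarray(w) - w_star) ** 2))

gap_radial = max_gap(radial, w_star, mu=0.0)
gap_quadratic_mu2 = max_gap(quadratic, w_star, mu=2.0)
gap_quadratic_mu3 = max_gap(quadratic, w_star, mu=3.0)
print(gap_radial, gap_quadratic_mu2, gap_quadratic_mu3)
```

Under these assumptions, the radial function passes the check with `mu=0`, the quadratic meets it with equality (up to rounding) at `mu=2`, and the quadratic fails it at `mu=3`, which shows how the parameter $\mu$ bounds the curvature required along every line through $\omega^*$.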

Figures (10)

  • Figure 1: Illustration of iterative point cloud registration with the expectation of loss decrease on the landscape, where each $M$ denotes an affine transformation.
  • Figure 2: Illustration of differences in loss landscapes (a) without and (b) with convexification w.r.t. $\omega$, given data $x$.
  • Figure 3: Performance comparison in both training and testing for LSTM on Pixel-MNIST with and without our DLC.
  • Figure 4: Test-time loss landscape comparison using PRNet on ModelNet40: (a,c) rotation matrices, and (b,d) translations.
  • Figure 5: Illustration of test-time point cloud registration for (top) a successful case and (bottom) a failure case using DLC+PRNet.
  • ...and 5 more figures

Theorems & Definitions (8)

  • Definition 1: Star-Convexity [lee2016optimizing]
  • Definition 2: Strong Star-Convexity [pmlr-v125-hinder20a]
  • Lemma 1
  • proof
  • Lemma 2
  • proof
  • Lemma 3
  • proof