
Global optimality under amenable symmetry constraints

Peter Orbanz

Abstract

Consider a convex function that is invariant under a group of transformations. If it has a minimizer, does it also have an invariant minimizer? Variants of this problem appear in nonparametric statistics and in a number of adjacent fields. The answer depends on the choice of function, and on what one may loosely call the geometry of the problem -- the interplay between convexity, the group, and the underlying vector space, which is typically infinite-dimensional. We observe that this geometry is completely encoded in the smallest closed convex invariant subsets of the space, and proceed to study these sets, for groups that are amenable but not necessarily compact. We then apply this toolkit to the invariant optimality problem. It yields new results on invariant kernel mean embeddings and risk-optimal invariant couplings, and clarifies relations between seemingly distinct ideas, such as the summation trick used in machine learning to construct equivariant neural networks and the classic Hunt-Stein theorem of statistics.

Paper Structure (41 sections, 160 equations, 3 figures)

Figures (3)

  • Figure 1: Finite-dimensional Følner averages in the (trivial) case where $\mathbb{G}$ is compact. Here, the rotation group $\mathbb{G}$ acts on ${X=\mathbb{R}^2}$. Left: A point $x$ and its orbitope (the gray disc). Middle left/right: The average $\mathbf{F}_n(x)$ is the barycenter of the uniform distribution on the black line segment $\mathbf{A}_n(x)$. It is not an element of the orbit $\mathbb{G}(x)$, but it is in $\Pi(x)$. Right: If ${\mathbf{A}_n=\mathbb{G}}$, the barycenter $\mathbf{F}_n(x)$ is $\mathbb{G}$-invariant.
  • Figure 2: Finite-dimensional illustration of some of the sets in \ref{theorem:day}. Left: A compact set $K$ in which $x$ is extreme. $K$ is invariant under rotations around its middle axis, and $K_\mathbb{G}$ is the intersection of this axis and $K$. Middle: The orbitope $\Pi(x)$ is a closed disc that contains a single invariant element ${\bar{x}}$. Right: A compact set invariant under reflections over the vertical axis. The orbitope of $x$ is the intersection of $K$ with the horizontal line through $x$. Even though $x$ is an extreme point of $K$, $\bar{x}$ is not.
  • Figure 3: Orbitopes of coin flip pairs. Left: The set $\mathcal{P}$ of distributions on ${\lbrace 0,1 \rbrace}^2$ is the convex hull of its four point masses, and can be identified with a subset of ${{\mathbb{R}^3}}$. Middle left: The set $\mathcal{P}_\mathbb{G}$ of permutation-invariant distributions is a convex subset of $\mathcal{P}$. The orbits of $\mathbb{G}$ in ${{\lbrace 0,1 \rbrace}^2}$ are the sets ${\lbrace 00 \rbrace}$, ${\lbrace 11 \rbrace}$, and ${\lbrace 01,10 \rbrace}$, and the extreme points of $\mathcal{P}_\mathbb{G}$ are the uniform distributions on these orbits. Middle right: Orbitopes of point masses are faces of $\mathcal{P}$ (cf. \ref{lemma:cid:orbitope:pointmass}), here two singletons and an edge. Right: The orbitope of a measure $P$ in the interior.
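
The coin flip setting of Figure 3 is small enough to compute by hand, and the following minimal Python sketch does so. It assumes the group is the two-element permutation group acting on ${\lbrace 0,1 \rbrace}^2$ by swapping coordinates, so averaging over the group is a plain two-term average and no Følner limit is needed. The names `OUTCOMES`, `swap`, and `symmetrize` are hypothetical and chosen only for this illustration; they do not come from the paper.

```python
from fractions import Fraction

# The four outcomes of a pair of coin flips, written as strings.
OUTCOMES = ["00", "01", "10", "11"]

def swap(P):
    """Image of the distribution P under the coordinate swap (a, b) -> (b, a)."""
    return {w: P[w[::-1]] for w in OUTCOMES}

def symmetrize(P):
    """Average P over the group {identity, swap}.

    The result is a convex combination of the orbit {P, swap(P)},
    so it lies in the orbitope of P, and it is swap-invariant.
    """
    Q = swap(P)
    return {w: (P[w] + Q[w]) / 2 for w in OUTCOMES}

# An arbitrary (non-invariant) distribution, chosen for illustration.
P = {"00": Fraction(1, 2), "01": Fraction(1, 3),
     "10": Fraction(0), "11": Fraction(1, 6)}
P_bar = symmetrize(P)

# Averaging the point mass at 01 gives the uniform distribution
# on the orbit {01, 10}.
delta_01 = {"00": Fraction(0), "01": Fraction(1),
            "10": Fraction(0), "11": Fraction(0)}
delta_bar = symmetrize(delta_01)
```

Here `P_bar` satisfies `swap(P_bar) == P_bar`, and `delta_bar` puts mass 1/2 on each of `01` and `10`, which matches the caption's description of the extreme points of $\mathcal{P}_\mathbb{G}$ as uniform distributions on orbits. Exact rational arithmetic is used so that the invariance check is an equality rather than a floating-point comparison.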
