Learning Weakly Convex Sets in Metric Spaces

Eike Stadtländer, Tamás Horváth, Stefan Wrobel

TL;DR

A general domain-independent algorithm for finding consistent weakly convex hypotheses is proposed. The algorithm is then extended to extensional weakly convex hypotheses, and it is proved that the extended algorithm solves the problem in polynomial time, provided the distances in the domain can be computed efficiently.

Abstract

One of the central problems studied in the theory of machine learning is the question of whether, for a given class of hypotheses, it is possible to efficiently find a consistent hypothesis, i.e., one that has zero training error. While problems involving convex hypotheses have been extensively studied, the question of whether efficient learning is possible for non-convex hypotheses composed of possibly several disconnected regions is still less understood. Although it was shown quite a while ago that efficient learning of weakly convex hypotheses, a parameterized relaxation of convex hypotheses, is possible for the special case of Boolean functions, the question of whether this idea can be developed into a generic paradigm has not yet been studied. In this paper, we provide a positive answer and show that the consistent hypothesis finding problem can indeed be solved in polynomial time for a broad class of weakly convex hypotheses over metric spaces. To this end, we propose a general domain-independent algorithm for finding consistent weakly convex hypotheses and prove sufficient conditions for its efficiency that characterize the corresponding hypothesis classes. To illustrate our general algorithm and its properties, we discuss several non-trivial learning examples and demonstrate how it can be used to efficiently solve the corresponding consistent hypothesis finding problems. Without the weak convexity constraint, these problems are known to be computationally intractable. We then proceed to show that the general idea of our algorithm can even be extended to the case of extensional weakly convex hypotheses, as they naturally arise, e.g., when performing vertex classification in graphs. We prove that using our extended algorithm, the problem can be solved in polynomial time provided the distances in the domain can be computed efficiently.
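The notion of weak convexity can be illustrated on a small finite metric space. The sketch below rests on an assumption that this abstract does not spell out: a set $A$ is taken to be $\theta$-convex if, for all $x, y \in A$ with $d(x, y) \le \theta$, every point $z$ with $d(x, z) + d(z, y) = d(x, y)$ also lies in $A$. The names `theta_convex_hull` and `line_dist` are hypothetical, and the naive fixpoint iteration is only an illustration of the closure idea, not the algorithm proposed in the paper.

```python
from itertools import combinations


def line_dist(a, b):
    """Example metric (an assumption for illustration): distance on the integer line."""
    return abs(a - b)


def theta_convex_hull(points, seed, dist, theta, tol=1e-9):
    """Naive theta-convex closure of `seed` inside the finite domain `points`.

    Assumed definition: whenever x, y are in the set and dist(x, y) <= theta,
    every z with dist(x, z) + dist(z, y) == dist(x, y) must be in the set too.
    """
    hull = set(seed)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow during the pass.
        for x, y in combinations(list(hull), 2):
            dxy = dist(x, y)
            if dxy > theta:
                continue  # pairs farther apart than theta impose no constraint
            for z in points:
                on_segment = abs(dist(x, z) + dist(z, y) - dxy) <= tol
                if z not in hull and on_segment:
                    hull.add(z)
                    changed = True
    return hull


domain = list(range(11))
small = theta_convex_hull(domain, {0, 2, 9}, line_dist, theta=3)
large = theta_convex_hull(domain, {0, 2, 9}, line_dist, theta=10)
```

Under these assumptions, a small `theta` only fills the gap between the nearby points `0` and `2` and leaves `9` as a separate region, while a large `theta` yields the ordinary convex hull `0..9`. This mirrors the abstract's description of hypotheses composed of possibly several disconnected regions.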

Paper Structure

This paper contains 10 sections, 18 theorems, 31 equations, 4 figures, 1 table, 2 algorithms.

Key Result

Theorem 2

Let $\mathcal{C} \subseteq 2^X$ be a concept class over some domain $X$ with VC-dimension $d > 0$.

Figures (4)

  • Figure 1: Examples of $\theta$-convex sets in $\mathbb{R}^2$ for different values of $\theta$.
  • Figure 2: The geodesic $\theta$-convex hulls (in red) of a set of $40$ vertices for $\theta=8, 15, \text{ and } 114$ in a graph with $10,000$ vertices.
  • Figure 3: Precision, recall, and accuracy ($y$-axes) for various numbers of training examples ($x$-axes) for the balanced graphs with different graph sizes ($\lvert V\rvert$).
  • Figure 4: An exemplary Delaunay graph with $250$ vertices. On the left, the unknown target concept (depicted in red). It is the $\theta$-convex hull of the six generator vertices marked with a black border for $\theta = 8$. The target concept is not convex; the convex hull of the generators contains the vertices enclosed by the black line. Notice that there is a negative point enclosed by three positive points in the lower part of the target concept. On the right, the figure shows the prediction of the hypothesis returned by our generic algorithm for the $40$ training examples marked with a black border. The image depicts true positives, true negatives, and false negatives. In this case, there were no false positives. The convex hull of the positive examples contains the vertices enclosed by the black line. In this example, it is the same as the convex hull on the left.

Theorems & Definitions (36)

  • Theorem 2
  • Definition 3
  • Theorem 4
  • proof
  • Proposition 5
  • proof
  • Proposition 6
  • proof
  • Remark 7
  • Lemma 9
  • ...and 26 more