Input Invex Neural Network

Suman Sapkota, Binod Bhattarai

TL;DR

Input Invex Neural Networks (II-NN) address the challenge of generating connected decision boundaries by enforcing invexity in neural models, so that the lower contour sets are simply connected. The work presents two concrete constructions: (i) Gradient Clipped Gradient Penalty (GC-GP), which constrains local input gradients via gradient clipping and a smooth projected-gradient penalty, and (ii) a modular composition $f_{invex}(X) = f_{cone}(f_{invertible}(X))$ of an invertible backbone with a convex function. These methods enable binary and multi-region (multi-invex) classifiers with interpretable, locality-aware decisions. They are validated on toy data and standard image benchmarks (MNIST, Fashion-MNIST, CIFAR-10/100), where they achieve accuracy competitive with ordinary and convex baselines. The approach supports network morphism-based NAS for local region learning and offers a path toward more interpretable regions in input space, though GC-GP does not guarantee invexity in all cases and formal proofs remain open for future work. Overall, II-NN provides a principled framework for constructing interpretable, region-based classifiers from invex functions and connected sets.

Abstract

Connected decision boundaries are useful in several tasks, such as image segmentation, clustering, alpha-shapes, or defining a region in nD-space. However, the machine learning literature lacks methods for generating connected decision boundaries using neural networks. Thresholding an invex function, a generalization of a convex function, generates such decision boundaries. This paper presents two methods for constructing invex functions using neural networks. The first approach constrains a neural network with a Gradient Clipped-Gradient Penalty (GCGP), where we clip and penalise the gradients. The second is based on the relationship of the invex function to the composition of invertible and convex functions. We employ connectedness as a basic interpretation method and create connected region-based classifiers. We show that multiple connected-set-based classifiers can approximate any classification function. In the experiments section, we use our methods for classification tasks using an ensemble of 1-vs-all models as well as a single multiclass model on small-scale datasets. The experiments show that connected-set-based classifiers pose no disadvantage compared with ordinary neural network classifiers and, in addition, enhance interpretability. We also conduct an extensive study of the properties of invex functions and connected sets for interpretability and network morphism, with experiments on toy and real-world data sets. Our study suggests that the invex function is fundamental to understanding and applying locality and connectedness of the input space, which is useful for various downstream tasks.
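
The second construction can be illustrated with a minimal sketch, under stated assumptions: the shear map `invertible` and the quadratic `convex` below are hypothetical stand-ins chosen for readability, not the paper's networks. Because the sublevel set of the convex function is convex in the latent space, any two points of the thresholded region can be joined by a straight latent segment, and mapping that segment back through the inverse gives a curve that stays inside the region in input space.

```python
import math

def invertible(x, y):
    """Hypothetical invertible map (a sine shear); stands in for an invertible network."""
    return x, y - math.sin(2.0 * x)

def invertible_inverse(u, v):
    """Exact inverse of `invertible`."""
    return u, v + math.sin(2.0 * u)

def convex(u, v):
    """Convex function with a single minimum at the origin (illustrative choice)."""
    return u * u + v * v

def f_invex(x, y):
    """Composition convex(invertible(x, y)); its sublevel sets are connected."""
    return convex(*invertible(x, y))

def connecting_path(p, q, steps=50):
    """Curve from p to q: a straight segment in latent space, mapped back to input space."""
    zp, zq = invertible(*p), invertible(*q)
    path = []
    for i in range(steps + 1):
        t = i / steps
        u = (1 - t) * zp[0] + t * zq[0]
        v = (1 - t) * zp[1] + t * zq[1]
        path.append(invertible_inverse(u, v))
    return path

threshold = 4.0
p, q = (-1.6, 1.2), (0.0, 1.9)  # both satisfy f_invex <= threshold

# The curved path stays inside the region {f_invex <= threshold} ...
path_inside = all(f_invex(x, y) <= threshold for x, y in connecting_path(p, q))

# ... while the straight input-space midpoint leaves it, so the region is not convex.
mid = ((p[0] + q[0]) / 2, (p[1] + q[1]) / 2)
midpoint_inside = f_invex(*mid) <= threshold
```

In this toy setting the region is connected but not convex, which is the kind of decision boundary a thresholded invex function is meant to produce.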

Paper Structure

This paper contains 58 sections, 3 theorems, 18 equations, 23 figures, 10 tables, 5 algorithms.

Key Result

Proposition 1

Let $f:\mathbf{X}\to\mathbb{R}$ and $g:\mathbf{X}\to\mathbb{R}$ be two functions on a vector space $\mathbf{X}$. Let $\mathbf{x}\in \mathbf{X}$ be any point, let $\mathbf{x^*}$ be the minimum of $f$, and let $\mathbf{x} \neq \mathbf{x^*}$. If $f$ is an invex function and if [...], then $h(\mathbf{x}) = f(\mathbf{x})+g(\mathbf{x})$ is an invex function.
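
As background for the proposition, the standard textbook definition of invexity is recalled here (it is not quoted from this page, and the symbol $\eta$ is introduced only for this note). A differentiable function $f:\mathbf{X}\to\mathbb{R}$ is invex if there exists a vector-valued function $\eta:\mathbf{X}\times\mathbf{X}\to\mathbf{X}$ such that

$$
f(\mathbf{x}) - f(\mathbf{y}) \;\ge\; \eta(\mathbf{x}, \mathbf{y})^{\top}\, \nabla f(\mathbf{y}) \quad \text{for all } \mathbf{x}, \mathbf{y} \in \mathbf{X}.
$$

Choosing $\eta(\mathbf{x}, \mathbf{y}) = \mathbf{x} - \mathbf{y}$ recovers the first-order condition for convexity, so every differentiable convex function is invex. Equivalently, a differentiable $f$ is invex exactly when every stationary point ($\nabla f(\mathbf{y}) = 0$) is a global minimum.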

Figures (23)

  • Figure 1: Different types of sets according to the decision boundary in continuous space. (a) Convex sets have all their points inside a convex decision boundary. A straight line connecting any two points in the set also lies inside the set. Here, A is a convex set, and B is a non-convex set. (b) Connected sets have continuous space between any two points within the set. Any two points in a connected set can be joined by a curve that also lies inside the set. Here, both A and B are connected sets; A is a bounded 1-connected set. (c) Disconnected sets are the opposite of connected sets. Some pairs of points in a disconnected set cannot be joined by a curve that lies inside the set. Here, A is the disconnected set and B is the connected set. (d) The same decision boundary as (c), represented by multiple connected sets. The disconnected set A in (c) is the union of the connected sets A and C in (d). Here, A, B and C are all connected sets. However, $A \cup C$ is a disconnected set and $B \cup C$ is still a connected set.
  • Figure 2: 3D plot (top row) and contour plot (bottom row) of Quasi-Convex, Invex and Ordinary Functions. The global minimum (red star) is plotted for the Convex and Invex Functions. Contour plots at different levels show the decision boundary made by each class of functions. Zoom in on the diagram for details.
  • Figure 3: Classification of groups of data points ($G_i$) in (a), by Ordinary Neural Networks in (b), and by the Region-based classification method in (c). Here, (a) shows the input space of an XOR-type toy dataset where $G_0$ and $G_3$ belong to class $C_1$ and the rest to class $C_0$. (b) shows two-stage classification by Ordinary Neural Networks, where the input space is transformed (folded) into two class clusters that are separated by simple classifiers. (c) shows a different approach, Region-based classification. Each of the regions (connected sets) $R_i$ is assigned a class label. This approach still uses neural networks for non-linear morphing of the input space; however, it does not fold the space but only morphs it, so disconnected sets of the same class must be assigned to different regions.
  • Figure 4: Left: Pipeline for Basic II-NN, with corresponding pseudo-code in the Basic II-NN algorithm. Right: Function used for output gradient-clipping (GC) and projected gradient-penalty (GP), where the x-axis is the projected-gradient value.
  • Figure 5: Network Diagram of Multi-Invex Classifier. The leaf nodes are produced either by a linear decision tree or by a nearest or linear classifier.
  • ...and 18 more figures
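
The distinction between connected and disconnected sets in Figure 1 can be checked numerically. The following is a rough sketch, assuming a 2D input and a regular grid: it thresholds a function and counts the 4-connected components of the resulting region with a flood fill. The functions `bowl` and `two_wells` and the helper `count_components` are hypothetical examples, not taken from the paper, and a grid count is only an approximation of true topological connectedness.

```python
from collections import deque

def count_components(f, threshold, lo=-3.0, hi=3.0, n=61):
    """Count 4-connected components of {f <= threshold} on an n x n grid over [lo, hi]^2."""
    step = (hi - lo) / (n - 1)
    inside = [[f(lo + i * step, lo + j * step) <= threshold for j in range(n)]
              for i in range(n)]
    seen = [[False] * n for _ in range(n)]
    count = 0
    for i in range(n):
        for j in range(n):
            if not inside[i][j] or seen[i][j]:
                continue
            # New component found: flood-fill everything reachable from (i, j).
            count += 1
            seen[i][j] = True
            queue = deque([(i, j)])
            while queue:
                a, b = queue.popleft()
                for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    c, d = a + da, b + db
                    if 0 <= c < n and 0 <= d < n and inside[c][d] and not seen[c][d]:
                        seen[c][d] = True
                        queue.append((c, d))
    return count

def bowl(x, y):
    """Convex (hence invex) function with one minimum."""
    return x * x + y * y

def two_wells(x, y):
    """Function with two separate minima; it is not invex."""
    return min((x - 1.5) ** 2 + y * y, (x + 1.5) ** 2 + y * y)

bowl_regions = count_components(bowl, 1.0)        # one connected region
wells_regions = count_components(two_wells, 1.0)  # two disjoint regions
```

Thresholding the single-minimum function gives one region, while the two-minima function gives a disconnected set that, as in panel (d) of Figure 1, would have to be represented as a union of two connected regions.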

Theorems & Definitions (3)

  • Proposition 1
  • Proposition 2
  • Proposition 3