Smooth Min-Max Monotonic Networks

Christian Igel

TL;DR

Monotonicity constraints are valuable for plausibility and fairness, but min-max (MM) networks are hard to train: units that never determine a maximum or minimum stay silent and receive zero gradients. The authors introduce Smooth Min-Max (SMM) networks, which replace max and min with smooth LogSumExp operators, yielding differentiable, end-to-end trainable monotone models that preserve MM's asymptotic approximation guarantees. Empirical results across univariate, multivariate, and real-world partially monotone tasks show SMM achieving state-of-the-art or competitive generalization with lower complexity and robust training dynamics. Overall, SMM serves as a drop-in monotone module that combines simplicity, efficiency, and strong empirical performance, challenging the need for more complex monotone architectures. The work highlights the practical impact of smooth monotone designs for scientific modelling and responsible AI applications.
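The zero-gradient issue can be illustrated with a minimal sketch. It assumes the LogSumExp smooth maximum has the form $\frac{1}{\beta}\log\sum_i \exp(\beta x_i)$, whose partial derivatives are softmax weights; the scaling parameter `beta` and both function names are hypothetical and chosen only for this illustration.

```python
import math

def hard_max_grad(values):
    """Partial derivatives of max(values): 1 for the first maximizer, 0 elsewhere."""
    k = values.index(max(values))
    return [1.0 if i == k else 0.0 for i in range(len(values))]

def smooth_max_grad(values, beta=1.0):
    """Partial derivatives of (1/beta) * log(sum(exp(beta * v))): a softmax."""
    m = max(values)  # shift by the maximum for numerical stability
    weights = [math.exp(beta * (v - m)) for v in values]
    total = sum(weights)
    return [w / total for w in weights]

# Three hypothetical unit activations; only the first one attains the maximum.
activations = [1.0, 0.9, -2.0]
print(hard_max_grad(activations))    # non-maximal units get exactly zero gradient
print(smooth_max_grad(activations))  # every unit gets a strictly positive gradient
```

With the hard maximum, the second and third units receive no learning signal at this input, however close they are to the maximum. With the smooth variant every unit contributes, in proportion to how close its activation is to the largest one.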

Abstract

Monotonicity constraints are powerful regularizers in statistical modelling. They can support fairness in computer-aided decision making and increase plausibility in data-driven scientific models. The seminal min-max (MM) neural network architecture ensures monotonicity, but often gets stuck in undesired local optima during training because partial derivatives of the MM nonlinearities are zero. We propose a simple modification of the MM network using strictly increasing smooth minimum and maximum functions that alleviates this problem. The resulting smooth min-max (SMM) network module inherits the asymptotic approximation properties from the MM architecture. It can be used within larger deep learning systems trained end-to-end. The SMM module is conceptually simple and computationally less demanding than state-of-the-art neural networks for monotonic modelling. Our experiments show that this does not come with a loss in generalization performance compared to alternative neural and non-neural approaches.
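A forward pass through such a module might look like the following sketch. Several details are assumptions rather than statements from the text above: the layout (a minimum over groups, each group a maximum over linear units with positive weights) follows the usual description of MM networks, positivity is enforced here by exponentiating unconstrained parameters, which is only one possible choice, and `beta`, `smooth_max`, `smooth_min`, and `smm_forward` are hypothetical names.

```python
import math

def smooth_max(values, beta=1.0):
    """LogSumExp smooth maximum, shifted by max(values) for stability."""
    m = max(values)
    return m + math.log(sum(math.exp(beta * (v - m)) for v in values)) / beta

def smooth_min(values, beta=1.0):
    """Smooth minimum obtained by negating inputs and output of smooth_max."""
    return -smooth_max([-v for v in values], beta)

def smm_forward(x, raw_weights, biases, beta=1.0):
    """Smooth min over groups of smooth max over positive-weight linear units.

    raw_weights[k][j] is a list of unconstrained parameters (one per input),
    biases[k][j] is the bias of unit j in group k.
    """
    group_outputs = []
    for group_w, group_b in zip(raw_weights, biases):
        units = []
        for z, b in zip(group_w, group_b):
            # exp(z_i) > 0, so each unit is strictly increasing in every input
            units.append(sum(math.exp(zi) * xi for zi, xi in zip(z, x)) + b)
        group_outputs.append(smooth_max(units, beta))
    return smooth_min(group_outputs, beta)

# Hypothetical parameters: 2 groups, 2 units per group, 2 inputs.
raw_w = [[[0.1, -0.3], [-1.0, 0.5]], [[0.0, 0.2], [0.4, -0.7]]]
b = [[0.0, 0.1], [-0.2, 0.3]]
print(smm_forward([0.2, 0.5], raw_w, b))
print(smm_forward([0.3, 0.9], raw_w, b))  # larger inputs give a larger output
```

Because every building block is strictly increasing in its arguments, the composition is monotone in all inputs by construction, and no unit has an exactly zero partial derivative.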

Paper Structure (25 sections, 2 theorems, 10 equations, 5 figures, 9 tables)

Introduction
Background
Neural Networks with Positive Weights
Basic theoretical results.
Min-max networks.
Related Work
Lattice layers.
Certified monotonic neural networks.
Lipschitz monotonic networks.
Constrained monotonic neural networks.
Non-neural approaches.
Smooth Monotonic Networks
Approximation Properties
Partial Monotonic SMM
Experiments
...and 10 more sections

Key Result

Theorem 1

Let $f(\boldsymbol{x})$ be any continuous, bounded monotonic function with bounded partial derivatives, mapping $[0,1]^d$ to $\mathbb{R}$. Then there exists a function $f_{\text{net}}(\boldsymbol{x})$ which can be implemented by a monotonic network such that $|f(\boldsymbol{x})-f_{\text{net}}(\boldsymbol{x})|$ …

Figures (5)

Figure 1: Learning an allometric equation from data with an original min-max network (MM), XGBoost (XG) and a smooth min-max network (SMM), here estimating wood dry mass (and thereby stored carbon) from tree crown area [hiernaux:23, tucker:23].
Figure 2: Schema of a min-max module.
Figure 3.3: Results on univariate functions based on $T=21$ trials. Depicted are the median, first and third quartile of the MSE (without clipping the outputs to the target function codomain); the whiskers extend the box by 1.5 times the inter-quartile range, dots are outliers. Training errors are shown in the top row, test errors in the bottom row.
Figure 3.4: Function approximation results of a single trial (outputs not clipped) for each of the three univariate functions. The top row shows the non-neural, the bottom row the neural methods.
Figure 3.5: Results on multivariate functions based on $T=21$ trials. Depicted are the median, first and third quartile of the MSE; the whiskers extend the box by 1.5 times the inter-quartile range, dots are outliers. Early stopping reduced the XGBoost training accuracy but did not lead to an improvement on the test data.

Theorems & Definitions (3)

Theorem 1 [sill:97, daniels:2010]
Corollary 1
Proof
