Alternating Minimization for Regression with Tropical Rational Functions

Alex Dunbar, Lars Ruthotto

TL;DR

The paper tackles regression over the space of tropical rational functions with an alternating minimization heuristic: it alternately fits the numerator and the denominator tropical polynomial, and each step is solvable in closed form via max-plus/min-plus matrix-vector operations. The approach exploits the structure of tropical polynomials to give a computationally cheap heuristic for the nonconvex $\ell_\infty$ regression problem with a fixed exponent set $W$. Empirical results on univariate, bivariate, and higher-dimensional data show that the method yields reasonable fits and a monotone (nonincreasing) loss across iterations, and they give insight into the effects of degree, input scaling, and neural-network initialization. The work connects tropical algebra to ReLU networks, demonstrates potential benefits for initialization, and outlines extensions to other norms, sparsity, and monomial selection, while noting the lack of general convergence guarantees and the risk of overfitting in higher dimensions.

Abstract

We propose an alternating minimization heuristic for regression over the space of tropical rational functions with fixed exponents. The method alternates between fitting the numerator and denominator terms via tropical polynomial regression, which is known to admit a closed form solution. We demonstrate the behavior of the alternating minimization method experimentally. Experiments demonstrate that the heuristic provides a reasonable approximation of the input data. Our work is motivated by applications to ReLU neural networks, a popular class of network architectures in the machine learning community which are closely related to tropical rational functions.
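
The alternating scheme described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the paper's Algorithm 1: the function names (`maxplus`, `fit_tropical_poly`, `alternating_fit`), the zero initialization, the fixed iteration count, and the synthetic sine data are all hypothetical choices. The closed-form step uses the classical minimax result for max-plus systems: take the greatest coefficient vector whose fit lies below the targets, then shift it up by half the worst residual.

```python
import numpy as np

def maxplus(A, v):
    """Max-plus matrix-vector product: (A [+] v)_i = max_j (A_ij + v_j)."""
    return np.max(A + v[None, :], axis=1)

def fit_tropical_poly(A, y):
    """Closed-form minimizer of ||A [+] p - y||_inf over p."""
    # Greatest p with A [+] p <= y, computed as a min-plus product.
    p_sub = np.min(y[:, None] - A, axis=0)
    # Worst one-sided residual; shifting by half of it balances the error.
    mu = np.max(y - maxplus(A, p_sub))
    return p_sub + mu / 2.0

def alternating_fit(X, y, W, iters=50):
    """Alternately fit numerator p and denominator q of f = p - q.

    X: (N, d) inputs, y: (N,) targets, W: (D, d) fixed exponents.
    Zero initialization and a fixed iteration count are assumptions.
    """
    A = X @ W.T                      # A_ij = <x_i, w_j>
    p = np.zeros(W.shape[0])
    q = np.zeros(W.shape[0])
    errors = []
    for _ in range(iters):
        # Fix q: the numerator should match y + (A [+] q).
        p = fit_tropical_poly(A, y + maxplus(A, q))
        # Fix p: the denominator should match (A [+] p) - y.
        q = fit_tropical_poly(A, maxplus(A, p) - y)
        errors.append(np.max(np.abs(maxplus(A, p) - maxplus(A, q) - y)))
    return p, q, errors

# Hypothetical demo data: noisy samples of sin(x), integer exponents 0..15.
rng = np.random.default_rng(0)
x = np.linspace(-np.pi, np.pi, 200)
y = np.sin(x) + 0.1 * rng.standard_normal(x.size)
X = x[:, None]
W = np.arange(16, dtype=float)[:, None]
p, q, errors = alternating_fit(X, y, W)
```

Because each half-step exactly minimizes the $\ell_\infty$ error over one block of coefficients, the recorded `errors` cannot increase from one sweep to the next (up to floating-point rounding), which matches the nonincreasing loss reported in the experiments. The sketch says nothing about how close the final fit is to a global minimizer.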

Paper Structure

This paper contains 27 sections, 11 theorems, 51 equations, 8 figures, 1 table, 1 algorithm.

Key Result

Theorem 2.1

Given a dilation $h:\mathbb{T}^n \to \mathbb{T}^m$, there is a unique erosion $g:\mathbb{T}^m \to \mathbb{T}^n$ such that $(h,g)$ is an adjunction.
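
For orientation, the standard lattice-theoretic construction of the adjoint erosion is sketched below. It is the textbook formula and is assumed, not confirmed, to agree with the statement in the paper. The matrix form additionally assumes a dilation $h(\mathbf{x}) = \mathbf{A} \boxplus \mathbf{x}$ with finite entries $A_{ij}$, which is a hypothetical special case chosen for illustration.

$$
g(\mathbf{y}) = \bigvee \{\mathbf{x} \in \mathbb{T}^n : h(\mathbf{x}) \le \mathbf{y}\},
\qquad
h(\mathbf{x}) \le \mathbf{y} \iff \mathbf{x} \le g(\mathbf{y}).
$$

$$
h(\mathbf{x})_i = \max_j \,(A_{ij} + x_j)
\quad\Longrightarrow\quad
g(\mathbf{y})_j = \min_i \,(y_i - A_{ij}).
$$

In the matrix case the erosion is a min-plus product with the negated transpose of $\mathbf{A}$, and $g(\mathbf{y})$ is the greatest $\mathbf{x}$ satisfying $\mathbf{A} \boxplus \mathbf{x} \le \mathbf{y}$.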

Figures (8)

  • Figure 1: The tropical hypersurfaces $\mathcal{V}(p)$ and $\mathcal{V}(q)$ and the nondifferentiability locus $X$ of $f = p-q$ in Example \ref{ex:Nondiff_trop}. The tropical hypersurfaces $\mathcal{V}(p)$ and $\mathcal{V}(q)$ divide $\mathbb{R}^2$ into polyhedral regions, while $X$ divides $\mathbb{R}^2$ into regions that can each be described as a finite union of polyhedra.
  • Figure 2: Results of applying Algorithm 1 with degree 15 tropical rational functions to noisy data from a sine curve. Figure \ref{fig:sin_alternate_fit} shows the training data, the approximation by a tropical rational function, and the function $\sin x$. The approximating function captures the general behavior of the dataset. Figure \ref{fig:sin_alternate_convergence} shows the $\ell_\infty$ error $e^k = \|\mathbf{X}\boxplus \mathbf{p}^k - \mathbf{X}\boxplus \mathbf{q}^k - \mathbf{y}\|_\infty$ and the update norm $\eta^k = \|\mathbf{p}^{k+1}(\mathbf{q}^{k+1})^\top - \mathbf{p}^{k}(\mathbf{q}^{k})^\top\|_\infty$. Both the training loss and the update norm are nonincreasing and contain intervals on which they are nearly constant.
  • Figure 3: Dependence of the error and the number of iterations on the degree of the tropical rational function fit to noisy data from a sine curve. The error decreases monotonically as a function of degree, with a large drop at degree 5. The number of iterations needed to reach the stopping criterion $\eta^k \leq 10^{-12}$ generally increases with the degree.
  • Figure 4: Results of applying Algorithm 1 with degree $10$ and $31$ tropical rational functions to the peaks dataset. The resulting degree $31$ function sketches the general behavior of the dataset (Figure \ref{fig:3131peaks}), while the degree $10$ function fails to approximate the data (Figure \ref{fig:1010peaks}). Figures \ref{fig:1010peakserror} and \ref{fig:3131peakserror} display the $\ell_\infty$ error $e^k = \|\mathbf{X}\boxplus \mathbf{p}^k - \mathbf{X}\boxplus \mathbf{q}^k - \mathbf{y}\|_\infty$ and the update norm $\eta^k = \|\mathbf{p}^{k+1}(\mathbf{q}^{k+1})^\top - \mathbf{p}^{k}(\mathbf{q}^{k})^\top\|_\infty$. For both degrees, the training loss and the update norm are each nonincreasing and contain intervals on which they are nearly constant.
  • Figure 5: Error in approximation to the peaks dataset when using a degree $(35,35)$ tropical rational function with inputs scaled by $c$. Here, the optimal value of $c$ is roughly 1.3 and gives a much lower training error than the optimal function with unscaled inputs.
  • ...and 3 more figures

Theorems & Definitions (20)

  • Example 2.1
  • Theorem 2.1: maragos2019tropical
  • Theorem 2.2: cuninghame-green_minimax_1979
  • proof
  • Theorem 2.3: zhang2018tropical
  • Corollary 2.3.1
  • Theorem 2.4: arora2018understanding
  • Proposition 3.1
  • proof
  • Proposition 3.2
  • ...and 10 more