
A local approach to parameter space reduction for regression and classification tasks

Francesco Romor, Marco Tezzele, Gianluigi Rozza

TL;DR

This work addresses the curse of dimensionality in regression and classification by introducing Local Active Subspaces (LAS), a framework that merges global active-subspace analysis with supervised clustering to reveal locally low-dimensional input manifolds. The method constructs local ridge surrogates within subregions defined by AS-informed partitions, and employs three clustering strategies (K-means, K-medoids with an AS-induced metric, and hierarchical top-down clustering, HAS) to adaptively refine the parameter space and automatically select local AS dimensions. The authors provide theoretical bounds for local ridge approximations, detail a classification approach for local AS dimensions, and demonstrate substantial performance gains on a range of scalar and vector-valued problems, including CFD-related tests and epidemiological models. The approach enables more accurate and efficient surrogate modelling by exploiting regional intrinsic dimensionality, with broad applicability to high-dimensional regression, inverse problems, and uncertainty quantification in engineering.

Abstract

Parameter space reduction has proved to be a crucial tool for speeding up many numerical tasks such as optimization, inverse problems, sensitivity analysis, and surrogate model design, especially for high-dimensional parametrized systems. In this work we propose a new method called local active subspaces (LAS), which explores the synergies of active subspaces with supervised clustering techniques in order to carry out a more efficient dimension reduction in the parameter space. The clustering preserves the input-output relations by using a distance metric induced by the global active subspace. We present two possible clustering algorithms: K-medoids and a hierarchical top-down approach, which can impose a variety of subdivision criteria specifically tailored for parameter space reduction tasks. This method is particularly useful for the community working on surrogate modelling. Frequently, the parameter space contains subdomains where the objective function of interest varies less on average along different directions, so it can be approximated more accurately if it is restricted to those subdomains and studied separately. We test the new method on several numerical experiments of increasing complexity, and we show how to deal with vectorial outputs and how to classify the different regions with respect to the local active subspace dimension. Employing this classification technique as a preprocessing step in the parameter space, or in the output space in the case of vectorial outputs, brings remarkable results for the purpose of surrogate modelling.
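The abstract does not spell out the distance metric induced by the global active subspace. As a rough illustration only, the sketch below assumes one plausible form: rotate the inputs into the eigenbasis of the gradient covariance matrix and scale each axis by its eigenvalue, so that points separated along inactive directions count as close. The function name `as_induced_distances`, the eigenvalue weighting, and the toy function are assumptions, not the paper's definitions.

```python
import numpy as np

def as_induced_distances(X, eigvals, eigvecs):
    """Pairwise distances sqrt((x-y)^T W L^2 W^T (x-y)).

    W holds the eigenvectors (columns) and L the eigenvalues of the
    gradient covariance matrix. The eigenvalue weighting is an assumed
    form of the AS-induced metric, chosen for illustration.
    """
    Z = (X @ eigvecs) * eigvals            # rotate, then scale each axis
    diff = Z[:, None, :] - Z[None, :, :]   # all pairwise differences
    return np.sqrt((diff ** 2).sum(axis=-1))

# Toy function (hypothetical): f(x) = x0^2 + 0.1 * x1, so x0 is the active direction.
rng = np.random.default_rng(0)
samples = rng.uniform(-1.0, 1.0, size=(500, 2))
grads = np.column_stack([2.0 * samples[:, 0], np.full(500, 0.1)])

# Monte Carlo estimate of the gradient covariance and its eigendecomposition.
sigma = grads.T @ grads / grads.shape[0]
eigvals, eigvecs = np.linalg.eigh(sigma)   # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]          # reorder to descending
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Three probe points: the second moves along x0, the third along x1.
points = np.array([[0.0, 0.0], [0.5, 0.0], [0.0, 0.5]])
D = as_induced_distances(points, eigvals, eigvecs)
```

A precomputed matrix such as `D` is the kind of input a K-medoids routine can cluster on directly, which is why a medoid-based method pairs naturally with a non-Euclidean metric.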

Paper Structure

This paper contains 24 sections, 2 theorems, 41 equations, 12 figures, 1 table, 4 algorithms.

Key Result

Theorem 1

The solution $P_{r}$ of the ridge approximation problem in Definition 1, with optimal profile $\tilde{h}=\mathbb{E}_{\boldsymbol{\mu}}[f\vert P_{r}]$, is the orthogonal projector onto the eigenspace of the first $r$ eigenvalues of $\Sigma$ ordered by magnitude, with $r\in\mathbb{N}$ chosen such that with $C(C_{P}, \tau)$ a constant depending on $\tau>0$ related to the choice of $\boldsymbol{\mu}$ a
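To make the object in Theorem 1 concrete, here is a minimal sketch of building such a projector numerically. It assumes, as is standard in the active subspaces literature, that $\Sigma$ is the uncentered covariance of the gradients of $f$, estimated here by Monte Carlo. The helper name `active_subspace_projector` and the test function $f(x) = (a \cdot x)^2$ are illustrative choices, not taken from the paper.

```python
import numpy as np

def active_subspace_projector(grads, r):
    """Orthogonal projector onto the span of the r leading eigenvectors
    of the Monte Carlo estimate of Sigma = E[grad f grad f^T]."""
    sigma = grads.T @ grads / grads.shape[0]
    eigvals, eigvecs = np.linalg.eigh(sigma)    # ascending eigenvalues
    leading = np.argsort(eigvals)[::-1][:r]     # indices of the r largest
    W1 = eigvecs[:, leading]
    return W1 @ W1.T

# Hypothetical test function f(x) = (a . x)^2, which varies only along a.
rng = np.random.default_rng(0)
a = np.array([3.0, 4.0, 0.0]) / 5.0             # unit vector
X = rng.uniform(-1.0, 1.0, size=(200, 3))
grads = 2.0 * (X @ a)[:, None] * a              # grad f(x) = 2 (a . x) a

P = active_subspace_projector(grads, r=1)
```

For this exact ridge function the estimated $\Sigma$ has rank one, so `P` coincides with $a a^{T}$ up to floating-point error and leaves `a` unchanged.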

Figures (12)

  • Figure 1: On the left panel, the contour plot of the quartic function, with the global active subspace direction in orange. On the right panel, the sufficient summary plot obtained by projecting the data onto the global AS.
  • Figure 2: Comparison between the different clusters obtained by K-means (on the left), K-medoids (middle panel), and hierarchical top-down (on the right) with the AS-induced distance metric defined in \ref{eq:as_norm} for the quartic test function. The global active subspace direction is shown in orange. Every cluster is depicted in a different color.
  • Figure 3: Local sufficient summary plots for the $4$ clusters identified by K-medoids or hierarchical top-down in \ref{fig:quartic_4_clusters} (colors correspond).
  • Figure 4: $R^2$ scores comparison between local versions varying the number of clusters for the quartic function. Global AS has a score equal to $0.78$.
  • Figure 5: On the left panel, the hierarchical top-down clustering with heterogeneous AS dimension and $R^2$ score equal to $1$. On the right panel, the labels of the local AS dimension from \ref{def:local as dimension}.
  • ...and 7 more figures

Theorems & Definitions (15)

  • Definition 1: Ridge approximation
  • Theorem 1: Definition of AS through ridge approximation
  • Definition 2: Local ridge approximation with active subspaces
  • Remark 1: Relationships between the upper bounds of consecutive refinements
  • Corollary 1: Counterexample for indefinite refinement as optimal clustering criterion
  • proof
  • Remark 2: Approximation of the optimal profile
  • Remark 3: Estimator based on local $R^2$ scores
  • Remark 4: Normalization of the clusters at each refinement iteration
  • Remark 5: Heuristics behind the choice of the active subspaces metric for K-medoids
  • ...and 5 more