Regularized GLISp for sensor-guided human-in-the-loop optimization

Matteo Cercola, Michele Lomuscio, Dario Piga, Simone Formentin

TL;DR

The paper tackles the inefficiency of purely black-box preference-based optimization by introducing a sensor-guided regularized extension of GLISp. It defines a physics-informed hypothesis function $f_{hp}(\mathbf{x}) = \sum_{r=1}^{p} w_r J_r(\mathbf{x})$, a weighted sum of $p$ measurable descriptors $J_r$, and adds a regularization term that aligns the surrogate with this prior. The surrogate coefficients $\boldsymbol{\beta}$ and the descriptor weights $\mathbf{w}$ are learned jointly, with adaptive cross-validation used to tune the hyperparameters. Empirical results on an analytical benchmark and a human-in-the-loop vehicle suspension task show faster convergence, lower final error, and reduced variance compared to baseline GLISp, illustrating the benefit of integrating measurable descriptors into preference learning. The approach improves robustness and interpretability, highlights the potential of grey-box optimization in human-in-the-loop calibration, and opens avenues for gradient-based or locally adaptive priors to further enhance performance.

Abstract

Human-in-the-loop calibration is often addressed via preference-based optimization, where algorithms learn from pairwise comparisons rather than explicit cost evaluations. While effective, methods such as Preferential Bayesian Optimization or Global optimization based on active preference learning with radial basis functions (GLISp) treat the system as a black box and ignore informative sensor measurements. In this work, we introduce a sensor-guided regularized extension of GLISp that integrates measurable descriptors into the preference-learning loop through a physics-informed hypothesis function and a least-squares regularization term. This injects grey-box structure, combining subjective feedback with quantitative sensor information while preserving the flexibility of preference-based search. Numerical evaluations on an analytical benchmark and on a human-in-the-loop vehicle suspension tuning task show faster convergence and superior final solutions compared to baseline GLISp.
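
To make the two ingredients concrete, the following is a minimal sketch of a hypothesis function of the form $f_{hp}(\mathbf{x}) = \sum_{r=1}^{p} w_r J_r(\mathbf{x})$ and of a least-squares term measuring how far an RBF surrogate lies from it at the sampled points. It is not the authors' implementation: the inverse-quadratic kernel, the two descriptors, the sample points, and the names `hypothesis`, `rbf_matrix`, and `prior_mismatch` are all assumptions chosen for illustration, and the sketch only evaluates the penalty rather than solving the joint learning problem for $\boldsymbol{\beta}$ and $\mathbf{w}$.

```python
import numpy as np

def hypothesis(X, w, descriptors):
    """Evaluate f_hp(x) = sum_r w_r * J_r(x) for every row of X (shape (n, d))."""
    J = np.column_stack([J_r(X) for J_r in descriptors])  # shape (n, p)
    return J @ w

def rbf_matrix(X, eps=1.0):
    """Pairwise inverse-quadratic RBF matrix; the kernel choice is an assumption."""
    d2 = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    return 1.0 / (1.0 + eps**2 * d2)

def prior_mismatch(beta, w, X, descriptors, eps=1.0):
    """Sum of squared differences between surrogate and hypothesis at the samples."""
    f_hat = rbf_matrix(X, eps) @ beta          # surrogate values at the samples
    f_hp = hypothesis(X, w, descriptors)       # prior values at the samples
    return float(np.sum((f_hat - f_hp) ** 2))

# Hypothetical descriptors standing in for sensor-derived quantities.
descriptors = [
    lambda X: np.sum(X**2, axis=1),
    lambda X: np.sum(np.abs(X), axis=1),
]
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, -0.5]])  # illustrative sample points
w = np.array([1.0, 2.0])                              # illustrative descriptor weights

beta_zero = np.zeros(len(X))
beta_fit = np.linalg.solve(rbf_matrix(X), hypothesis(X, w, descriptors))
print(prior_mismatch(beta_zero, w, X, descriptors))  # large: surrogate ignores the prior
print(prior_mismatch(beta_fit, w, X, descriptors))   # near zero: surrogate interpolates the prior
```

In a regularized scheme of this kind, a penalty like `prior_mismatch` would be weighted by a hyperparameter and added to the preference-fitting objective, so that the surrogate is pulled toward the sensor-based prior without being forced to match it.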

Paper Structure

This paper contains 12 sections, 6 equations, 3 figures.

Figures (3)

  • Figure 1: Sensor-guided regularized GLISp integrating quantitative sensor data into the preference-learning loop. The user indicates the preferred setup, while a domain expert defines the hypothesis function $f_{hp}$. Components that are new with respect to baseline GLISp are highlighted in red.
  • Figure 2: Difference between the best achieved value $y_{\rm best}$ at each iteration and the optimal value $y^*$: mean (solid line) $\pm$ one standard deviation (shaded region) over 10 Monte Carlo simulations. Baseline GLISp (orange) vs. regularized GLISp (blue).
  • Figure 3: Vehicle response in terms of vertical acceleration (top) and pitch rate (bottom) obtained using the parameters learned at the final iteration of the baseline (orange) and regularized (blue) GLISp. Solid lines indicate the mean response, and the shaded bands represent $\pm$ standard deviation over 10 Monte Carlo runs.
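
The convergence statistic described in the caption of Figure 2 can be sketched as follows. This is an illustrative computation, not the authors' code: it assumes a minimization problem, a hypothetical array `y_runs` holding one row of objective values per Monte Carlo run, and a population standard deviation (`ddof=0`), none of which is specified in the text above.

```python
import numpy as np

def best_so_far_gap(y_runs, y_star):
    """Mean and standard deviation of y_best - y_star at each iteration.

    y_runs: array of shape (n_runs, n_iters) with the objective value of the
            point queried at each iteration of each run (minimization assumed).
    y_star: known optimal value of the benchmark.
    """
    y_best = np.minimum.accumulate(y_runs, axis=1)  # running best per run
    gap = y_best - y_star
    return gap.mean(axis=0), gap.std(axis=0)

# Toy data: 2 runs, 3 iterations, optimum at 0.
y_runs = np.array([[3.0, 1.0, 2.0],
                   [4.0, 4.0, 0.0]])
mean_gap, std_gap = best_so_far_gap(y_runs, y_star=0.0)
print(mean_gap)  # [3.5 2.5 0.5]
print(std_gap)   # [0.5 1.5 0.5]
```

Plotting `mean_gap` as a line with a band of width `std_gap` on either side gives a curve of the kind the caption describes.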