Using Distance Correlation for Efficient Bayesian Optimization

Takuya Kanazawa

TL;DR

This paper introduces Bayesian inference with Distance Correlation (BDC), a hyperparameter-free Bayesian optimization (BO) framework that guides sequential querying with distance correlation (DC), a nonparametric measure of dependence that takes values in $[0,1]$ and vanishes if and only if the two random variables are independent. It applies DC to two settings: (i) global probing with integral observations at adaptive widths for efficient terrain mapping, and (ii) standard BO with point evaluations for maximization, where it achieves performance comparable to expected improvement (EI) and max-value entropy search (MES) across benchmarks. The approach supports adaptive, information-driven decisions without manual tuning and is evaluated on synthetic and real data, including remote-sensing-like tasks. The work has practical implications for costly measurements in domains such as medical imaging, geosciences, and hyperparameter tuning, and suggests directions for batch extensions and theoretical guarantees.

Abstract

The need to collect data via expensive measurements of black-box functions is prevalent across science, engineering and medicine. As an example, hyperparameter tuning of a large AI model is critical to its predictive performance but is generally time-consuming and unwieldy. Bayesian optimization (BO) is a collection of methods that aim to address this issue by means of Bayesian statistical inference. In this work, we put forward a BO scheme named BDC, which integrates BO with a statistical measure of association of two random variables called Distance Correlation. BDC balances exploration and exploitation automatically, and requires no manual hyperparameter tuning. We evaluate BDC on a range of benchmark tests and observe that it performs on par with popular BO methods such as the expected improvement and max-value entropy search. We also apply BDC to optimization of sequential integral observations of an unknown terrain and confirm its utility.
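To make the central quantity concrete, the following is a minimal sketch of the standard empirical distance correlation between two samples, computed from double-centered pairwise distance matrices. It illustrates only the dependence measure itself. How BDC uses it to choose the next query is not described in this summary and is not reproduced here. The function names `centered_distances` and `distance_correlation` are hypothetical, and the biased (V-statistic) estimator is an assumption about which sample version is meant.

```python
import numpy as np


def centered_distances(a):
    """Pairwise Euclidean distance matrix of a sample, double-centered."""
    a = np.asarray(a, dtype=float)
    if a.ndim == 1:
        a = a[:, None]  # treat a 1-D sample as n points in one dimension
    d = np.linalg.norm(a[:, None, :] - a[None, :, :], axis=-1)
    # Subtract row and column means, then add back the grand mean.
    return (d
            - d.mean(axis=0, keepdims=True)
            - d.mean(axis=1, keepdims=True)
            + d.mean())


def distance_correlation(x, y):
    """Empirical distance correlation (biased V-statistic form), in [0, 1].

    Assumption: this is the textbook sample estimator, not necessarily
    the exact variant used in the paper.
    """
    A = centered_distances(x)
    B = centered_distances(y)
    dcov2 = (A * B).mean()    # squared sample distance covariance
    dvar_x = (A * A).mean()   # squared sample distance variance of x
    dvar_y = (B * B).mean()   # squared sample distance variance of y
    denom = np.sqrt(dvar_x * dvar_y)
    if denom == 0.0:
        return 0.0            # a constant sample carries no dependence
    # Clamp tiny negative values caused by floating-point round-off.
    return float(np.sqrt(max(dcov2, 0.0) / denom))


# Usage: a nonlinear dependence that Pearson correlation largely misses.
rng = np.random.default_rng(0)
x = rng.normal(size=500)
dependent = distance_correlation(x, x**2)
independent = distance_correlation(x, rng.normal(size=500))
```

Because the estimator is biased, the value for independent samples is small but not exactly zero at finite sample size. The population quantity is what vanishes under independence.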

Paper Structure

This paper contains 18 sections, 6 equations, 8 figures, 2 tables, 2 algorithms.

Figures (8)

  • Figure 1: Two samples of randomly generated multi-modal functions over the unit interval.
  • Figure 2: Performance of BDC and the two baseline methods. The error bands represent the standard deviation of the mean, i.e., the sample standard deviation divided by $\sqrt{64}=8$.
  • Figure 4: Performance of BDC and the two baseline methods for semi-local gradient observations. The error bands represent the standard deviation of the mean, i.e., the sample standard deviation divided by $\sqrt{48}\simeq 6.93$.
  • Figure 6: Digital elevation model of the Grand Canyon in the United States (source: USGS).
  • Figure 7: Performance of BDC and the two baseline methods for evaluating the Grand Canyon data. The error bands represent the standard deviation of the mean, i.e., the sample standard deviation divided by $\sqrt{16}=4$.
  • ...and 3 more figures