DisCo-DSO: Coupling Discrete and Continuous Optimization for Efficient Generative Design in Hybrid Spaces

Jacob F. Pettit, Chak Shing Lee, Jiachen Yang, Alex Ho, Daniel Faissol, Brenden Petersen, Mikel Landajuela

TL;DR

DisCo-DSO tackles black-box optimization in hybrid discrete-continuous, variable-length spaces by using an autoregressive model to learn a joint distribution over complete designs. It extends discrete-continuous optimization to prefix-constrained, variable-length sequences and is trained with a risk-seeking policy gradient, so each sampled design requires only one objective evaluation. Across parameterized bitstrings, decision-tree policies for RL, and symbolic regression, it outperforms decoupled baselines and, in many cases, state-of-the-art methods, especially as problem complexity increases. The approach makes optimization in hybrid spaces with non-differentiable rewards more sample-efficient and scalable, with practical applications in RL with interpretable policies and in equation discovery.
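
The risk-seeking policy gradient mentioned above can be illustrated with a minimal NumPy sketch. It assumes the formulation used in earlier deep symbolic optimization work, in which only samples whose reward reaches the batch's empirical $(1 - \epsilon)$ quantile contribute to the gradient and that quantile acts as the baseline. The function name `risk_seeking_weights`, the reward values, and the choice of `epsilon` are illustrative and not taken from the paper.

```python
import numpy as np

def risk_seeking_weights(rewards, epsilon=0.1):
    """Per-sample weights for a risk-seeking policy gradient estimate.

    Assumed form: w_i = (R_i - R_eps) * 1[R_i >= R_eps] / (epsilon * N),
    where R_eps is the empirical (1 - epsilon) quantile of the batch rewards.
    """
    rewards = np.asarray(rewards, dtype=float)
    threshold = np.quantile(rewards, 1.0 - epsilon)
    keep = rewards >= threshold
    # Samples below the quantile get zero weight and do not affect the update.
    weights = np.where(keep, rewards - threshold, 0.0) / (epsilon * rewards.size)
    return weights, threshold

# Hypothetical batch of rewards, one black-box evaluation per sampled design.
rewards = np.array([0.1, 0.4, 0.35, 0.9, 0.7, 0.2, 0.05, 0.6, 0.8, 0.3])
weights, threshold = risk_seeking_weights(rewards, epsilon=0.3)
```

In a training loop, a surrogate loss such as `-(weights * log_probs).sum()` would be minimized, where `log_probs` holds the log-probability of each sampled design under the generative model.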

Abstract

We consider the challenge of black-box optimization within hybrid discrete-continuous and variable-length spaces, a problem that arises in various applications, such as decision tree learning and symbolic regression. We propose DisCo-DSO (Discrete-Continuous Deep Symbolic Optimization), a novel approach that uses a generative model to learn a joint distribution over discrete and continuous design variables to sample new hybrid designs. In contrast to standard decoupled approaches, in which the discrete and continuous variables are optimized separately, our joint optimization approach uses fewer objective function evaluations, is robust against non-differentiable objectives, and learns from prior samples to guide the search, leading to significant improvement in performance and sample efficiency. Our experiments on a diverse set of optimization tasks demonstrate that the advantages of DisCo-DSO become increasingly evident as the complexity of the problem increases. In particular, we illustrate DisCo-DSO's superiority over the state-of-the-art methods for interpretable reinforcement learning with decision trees.
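
To make the idea of a joint distribution over discrete and continuous design variables concrete, the following toy sketch samples a variable-length design token by token, drawing each continuous parameter together with its discrete label and accumulating one log-probability for the whole design. It is not the paper's model: the vocabulary, the lookup tables `logits_table` and `mu_table`, the Gaussian parameter distribution, and the `coupling` term are all hypothetical stand-ins for a learned autoregressive network.

```python
import numpy as np

# Hypothetical vocabulary: "A" carries a continuous parameter, "B" does not.
TOKENS = ["A", "B", "STOP"]
HAS_PARAM = [True, False, False]
START = 2  # reuse the STOP index as the start-of-sequence context

def sample_design(rng, logits_table, mu_table, coupling=0.5, sigma=0.5, max_len=6):
    """Sample one hybrid design [(label, beta), ...] and its joint log-probability."""
    design, log_prob = [], 0.0
    prev_idx, prev_beta = START, 0.0
    for _ in range(max_len):
        # Discrete part: categorical over tokens, conditioned on the previous label.
        logits = logits_table[prev_idx]
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        idx = int(rng.choice(len(TOKENS), p=probs))
        log_prob += float(np.log(probs[idx]))
        if TOKENS[idx] == "STOP":
            break
        beta = None
        if HAS_PARAM[idx]:
            # Continuous part: Gaussian whose mean depends on the label just
            # sampled and on the previous continuous parameter.
            mu = mu_table[idx] + coupling * prev_beta
            beta = float(rng.normal(mu, sigma))
            log_prob += float(
                -0.5 * ((beta - mu) / sigma) ** 2
                - np.log(sigma * np.sqrt(2.0 * np.pi))
            )
            prev_beta = beta
        design.append((TOKENS[idx], beta))
        prev_idx = idx
    return design, log_prob

rng = np.random.default_rng(0)
# Rows are indexed by the previous token: A, B, start-of-sequence.
logits_table = np.array([[0.0, 0.5, 0.0], [0.5, 0.0, 0.0], [1.0, 1.0, -1.0]])
mu_table = np.array([1.0, 0.0, 0.0])
design, log_prob = sample_design(rng, logits_table, mu_table)
```

A design sampled this way is complete, so the black-box objective would be evaluated once per design. A decoupled approach would instead sample only the labels and then run a separate inner optimization over the parameters, costing additional evaluations.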

Paper Structure

This paper contains 59 sections, 13 equations, 11 figures, 14 tables, 1 algorithm.

Figures (11)

  • Figure 1: Comparison of the standard decoupled approach and DisCo-DSO for discrete-continuous optimization using an autoregressive model. In the decoupled approach, the discrete skeleton $\tau_{\text{d}} = \langle (l_1, \cdot), \ldots, (l_T, \cdot) \rangle$ is sampled first and then the continuous parameters $\beta_1, \ldots, \beta_T$ are optimized independently. In contrast, DisCo-DSO models the joint distribution over the sequence of tokens $\langle (l_1, \beta_1), \ldots, (l_T, \beta_T)\rangle$. Here, the notation $\oplus$ stands for concatenation of vectors.
  • Figure 2: Reward of the best solution versus number of function evaluations on a parameterized bitstring task, for two continuous optimization landscapes $f_1$ and $f_2$ and weights $\alpha = 0.5, 0.9$. Solid lines correspond to $\alpha = 0.9$ and dashed lines to $\alpha = 0.5$. Curves show the mean and standard error over 5 seeds.
  • Figure 3: Left: the decision tree associated with the traversal $\langle x_1<2, a_2, x_2 < 6, x_1 < 3, a_1, a_3, a_2 \rangle$. Right: the corresponding bounds for the parameters during the sampling process, assuming the observations $x_1$ and $x_2$ are bounded in $[0, 5]$ and $[1, 8]$, respectively.
  • Figure 4: Reward of the best solution versus number of function evaluations on the decision tree policy task, for Acrobot-v1 and LunarLander-v2.
  • Figure 5: Best decision trees found by DisCo-DSO on the decision tree policy tasks for Acrobot-v1 and LunarLander-v2.
  • ...and 6 more figures
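
The traversal in the Figure 3 caption can be worked through in a short sketch that walks the sequence in pre-order and records the interval available to each threshold when it is sampled. The sketch assumes that the first child of a decision node is the branch where the condition $x < \beta$ holds, and that each branch tightens the bound of the tested observation accordingly. The token encoding and the function name `threshold_bounds` are hypothetical.

```python
def threshold_bounds(traversal, feature_bounds):
    """Return (feature, threshold, (low, high)) for each decision node, in pre-order.

    (low, high) is the interval in which that node's threshold can lie,
    given the decisions made by its ancestors.
    """
    out = []
    pos = 0

    def visit(bounds):
        nonlocal pos
        token = traversal[pos]
        pos += 1
        if token[0] == "leaf":
            return
        _, feat, thr = token
        lo, hi = bounds[feat]
        out.append((feat, thr, (lo, hi)))
        # Assumed convention: first child is the branch where x_feat < thr.
        visit({**bounds, feat: (lo, thr)})
        # Second child is the branch where x_feat >= thr.
        visit({**bounds, feat: (thr, hi)})

    visit(dict(feature_bounds))
    return out

# Traversal <x1 < 2, a2, x2 < 6, x1 < 3, a1, a3, a2> from the caption.
traversal = [
    ("split", "x1", 2), ("leaf", "a2"),
    ("split", "x2", 6),
    ("split", "x1", 3), ("leaf", "a1"), ("leaf", "a3"),
    ("leaf", "a2"),
]
bounds = threshold_bounds(traversal, {"x1": (0, 5), "x2": (1, 8)})
```

Under these assumptions, the root threshold on $x_1$ may lie anywhere in $[0, 5]$, the threshold on $x_2$ in $[1, 8]$, and the second threshold on $x_1$ only in $[2, 5]$, because that node sits on the branch where $x_1 \geq 2$.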