When Less is More: A Story of Failing Bayesian Optimization Due to Additional Expert Knowledge

Dorina Weichert, Gunar Ernis, Marvin Worthmann, Peter Ryzko, Lukas Seifert

TL;DR

The paper investigates optimizing recycled-plastic compound formulations with Bayesian Optimization (BO) under multiple constraints, and shows that adding expert-derived features can inadvertently increase dimensionality and hinder optimization. Through a sequence of experiments (vanilla constrained BO, constraint relaxation, problem reformulation, and finally BO in a simplified space), the authors demonstrate that a reduced four-parameter input space, coupled with a data-driven oracle based on historical experiments, can match or outperform expert designs while minimizing the distance to target properties. Key contributions include a detailed failure analysis of expert-informed BO, a practical simple BO approach, and guidelines for when to incorporate domain knowledge in industrial BO settings. The findings highlight the importance of problem formulation, feature selection, and adaptive constraint handling for real-world, resource-constrained materials design, with implications for accelerating experimental design in recycled-plastic development.

Abstract

The compounding of plastics with recycled material remains a practical challenge, as the properties of the processed material are not as easy to control as with completely new raw materials. For a data scientist, it makes sense to plan the necessary experiments in the development of new compounds using Bayesian Optimization, an optimization approach based on a surrogate model that is known for its data efficiency and is therefore well suited for data obtained from costly experiments. Furthermore, if historical data and expert knowledge are available, their inclusion in the surrogate model is expected to accelerate the convergence of the optimization. In this article, we describe a use case in which the addition of data and knowledge has impaired optimization. We also describe the unsuccessful methods that were used to remedy the problem before we found the reasons for the poor performance and achieved a satisfactory result. We conclude with a lesson learned: additional knowledge and data are only beneficial if they do not complicate the underlying optimization goal.

Paper Structure

This paper contains 21 sections, 3 figures.

Figures (3)

  • Figure 1: Values of quality metrics based on experiments by engineers.
  • Figure 2: Predictive performance of the model on the test data set; the root-mean-squared errors are given in scaled space.
  • Figure 3: Course of experimentation for the approaches. Run 4, with the reduced model, leads to more proposed experiments matching the constraints than runs 2 and 3, and more than the manual experimentation.