Constraining Generative Models for Engineering Design with Negative Data

Lyle Regenwetter, Giorgio Giannone, Akash Srivastava, Dan Gutfreund, Faez Ahmed

TL;DR

This work tackles constraint violations in generative models for engineering design by introducing negative-data generative models (NDGMs). It proposes GAN-MDD, which uses a multi-class discriminator to learn pairwise density ratios among positive, negative, and fake data, and adds a diversity loss based on determinantal point processes to mitigate mode collapse. Across extensive 2D benchmarks and a dozen real engineering tasks, GAN-MDD achieves superior trade-offs between constraint satisfaction and distributional fidelity, often with significantly greater data efficiency than vanilla models. The results show that NDGMs can dramatically reduce invalid designs (e.g., generating 1/6 as many constraint-violating samples using 1/8 as much data in certain problems) and improve diversity, suggesting substantial practical impact for constrained design optimization and topology problems. The work also discusses negative data quality, rejection sampling as a benchmark, and limitations, paving the way for broader adoption of NDGMs in engineering and related constrained domains.
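
As a rough illustration of the two ingredients named above, the following NumPy sketch shows a three-class cross-entropy discriminator loss and a determinantal-point-process-style diversity penalty. It is a minimal sketch under stated assumptions, not the paper's implementation: the class indices, the function names, the generator objective, the RBF kernel, and the `length_scale` and `eps` parameters are all hypothetical choices made for illustration. The link to density ratios is the standard one: for a well-trained softmax classifier with balanced classes, differences between class logits estimate pairwise log density ratios.

```python
import numpy as np

# Hypothetical class indices for the three-way discriminator.
POSITIVE, NEGATIVE, FAKE = 0, 1, 2


def log_softmax(logits):
    """Numerically stable log-softmax over the class axis."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))


def discriminator_loss(logits_pos, logits_neg, logits_fake):
    """Three-class cross-entropy; each argument has shape (batch, 3)."""
    loss = 0.0
    for logits, label in (
        (logits_pos, POSITIVE),
        (logits_neg, NEGATIVE),
        (logits_fake, FAKE),
    ):
        loss += -log_softmax(logits)[:, label].mean()
    return loss


def generator_loss(logits_fake):
    """Assumed objective: push generated samples toward the positive class."""
    return -log_softmax(logits_fake)[:, POSITIVE].mean()


def dpp_diversity_loss(samples, length_scale=1.0, eps=1e-6):
    """Negative log-determinant of an RBF similarity kernel over a batch.

    A batch of near-identical samples yields a near-singular kernel and
    therefore a large loss; a well-spread batch yields a loss near zero.
    """
    sq_dists = ((samples[:, None, :] - samples[None, :, :]) ** 2).sum(axis=-1)
    kernel = np.exp(-sq_dists / (2.0 * length_scale ** 2))
    _, logdet = np.linalg.slogdet(kernel + eps * np.eye(len(samples)))
    return -logdet
```

In a training loop, the diversity term would typically be added to the generator loss with a weighting coefficient; that weight is another assumption not specified in the text above.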

Abstract

Generative models have recently achieved remarkable success and widespread adoption in society, yet they often struggle to generate realistic and accurate outputs. This challenge extends beyond language and vision into fields like engineering design, where safety-critical engineering standards and non-negotiable physical laws tightly constrain what outputs are considered acceptable. In this work, we introduce a novel training method to guide a generative model toward constraint-satisfying outputs using "negative data": examples of what to avoid. Our negative-data generative model (NDGM) formulation easily outperforms classic models, generating 1/6 as many constraint-violating samples using 1/8 as much data in certain problems. It also consistently outperforms other baselines, achieving a balance between constraint satisfaction and distributional similarity that is unsurpassed by any other model in 12 of the 14 problems tested. This widespread superiority is rigorously demonstrated across numerous synthetic tests and real engineering problems, such as ship hull synthesis with hydrodynamic constraints and vehicle design with impact safety constraints. Our benchmarks showcase both the best-in-class performance of our new NDGM formulation and the overall dominance of NDGMs versus classic generative models. We publicly release the code and benchmarks at https://github.com/Lyleregenwetter/NDGMs.

Paper Structure (51 sections, 17 equations, 26 figures, 12 tables, 1 algorithm)

Figures (26)

  • Figure 1: Negative data helps generative models learn real-world data distributions, which often have gaps in their support caused by constraints. For example, by examining bike frames with disconnected components, a model can better learn to generate geometrically valid frames.
  • Figure 1: Study of invalidity rates for GAN-MDD trained with different numbers of positive datapoints ($N_p$) and negative datapoints ($N_n$). Note that GAN-MDD without negative data ($N_n=0$) trains as a vanilla GAN. Diversity loss is turned off. Scores are averaged over four instantiations. Lower is better. NDGMs can generate significantly fewer constraint-violating samples, even when trained on orders of magnitude less data.
  • Figure 2: Generated distributions from select generative models on Problem 1, a mixture of Gaussians with an invalid region in the center of each mode. Positive data points and samples are shown in blue and negative ones in black. Our proposed NDGM, GAN-MDD, learns the distribution most faithfully.
  • Figure 3: Generated distributions from select generative models on Problem 2, a uniform distribution with many circular invalid regions. Positive data points and samples are shown in blue and negative ones in black. Our proposed NDGM, GAN-MDD, learns the distribution most faithfully.
  • Figure 4: Comparison of F1 scores ($\uparrow$) and invalidity rates ($\downarrow$) for benchmarked models on Problem 1 (left) and Problem 2 (right). Mean scores over six instantiations are plotted. Scores closer to the bottom left are better. Triangular markers indicate that the score lies off the plot in the indicated direction. Class conditioning, classifier loss, and guidance are denoted with (CC), (CLF), and (Guided), respectively.
  • ...and 21 more figures
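
The invalidity rate reported in the captions above is the fraction of generated samples that violate the constraints. The sketch below computes it for a toy 2D setup loosely modeled on Problem 1, where a disc around each mode center is treated as invalid. The centers, the radius, the sample points, and the function names are hypothetical values chosen for illustration, not the paper's benchmark settings.

```python
import numpy as np


def is_invalid(points, centers, radius):
    """Flag points that fall inside any circular invalid region.

    points:  array of shape (n, 2)
    centers: array of shape (k, 2), one invalid disc per mode center
    radius:  shared radius of the invalid discs (assumed for illustration)
    """
    dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=-1)
    return (dists < radius).any(axis=1)


def invalidity_rate(points, centers, radius):
    """Fraction of samples that violate the constraint (lower is better)."""
    return float(is_invalid(points, centers, radius).mean())


# Toy example: two mode centers, four generated samples.
centers = np.array([[0.0, 0.0], [4.0, 0.0]])
samples = np.array([[0.1, 0.0], [1.0, 1.0], [4.0, 0.2], [2.0, 2.0]])
rate = invalidity_rate(samples, centers, radius=0.5)
print(rate)  # 0.5: the first and third samples fall inside an invalid disc
```

The same validity mask could also drive a rejection-sampling baseline by keeping only `samples[~is_invalid(samples, centers, 0.5)]`, assuming the constraint check is cheap enough to evaluate on every generated sample.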