Neural topology optimization: the good, the bad, and the ugly

Suryanarayanan Manoj Sanu, Alejandro M. Aragon, Miguel A. Bessa

TL;DR

This work analyzes neural topology optimization (neural TO), in which a neural network reparameterizes the design space of a topology optimization problem. It shows that NN-based reparameterization reshapes the loss landscape, sometimes enabling access to global optima in nontrivial problems but often introducing non-convexities that slow convergence, especially on convex tasks. By comparing MLP, SIREN, and CNN architectures across classic TO problems, the authors show that expressivity and optimization dynamics are architecture- and problem-dependent, with CNNs often offering favorable trade-offs. The study highlights promising directions, including leveraging trained NNs as priors and exploiting ML hardware, while also cautioning about hyperparameter sensitivity and scalability challenges in neural TO.

Abstract

Neural networks (NNs) hold great promise for advancing inverse design via topology optimization (TO), yet misconceptions about their application persist. This article focuses on neural topology optimization (neural TO), which leverages NNs to reparameterize the decision space and reshape the optimization landscape. While the method is still in its infancy, our analysis tools reveal critical insights into the NNs' impact on the optimization process. We demonstrate that the choice of NN architecture significantly influences the objective landscape and the optimizer's path to an optimum. Notably, NNs introduce non-convexities even in otherwise convex landscapes, potentially delaying convergence in convex problems but enhancing exploration for non-convex problems. This analysis lays the groundwork for future advancements by highlighting: 1) the potential of neural TO for non-convex problems and dedicated GPU hardware (the "good"), 2) the limitations in smooth landscapes (the "bad"), and 3) the complex challenge of selecting optimal NN architectures and hyperparameters for superior performance (the "ugly").
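
To make the reparameterization idea concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes a small coordinate-based MLP with a sigmoid output so that densities stay within the bounds $[0, 1]$; the helper names (`init_params`, `densities`, `element_centers`), the layer sizes, and the 8 x 4 mesh are all hypothetical. The finite element analysis and the optimizer are omitted; the point is only that the decision variables become the network parameters rather than the element densities.

```python
import numpy as np

def init_params(rng, n_hidden=16):
    """Random weights for a one-hidden-layer MLP (hypothetical sizes)."""
    return {
        "W1": rng.normal(0.0, 1.0, size=(2, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 1.0 / np.sqrt(n_hidden), size=(n_hidden, 1)),
        "b2": np.zeros(1),
    }

def densities(params, coords):
    """Map (x, y) coordinates to densities; the sigmoid keeps them in (0, 1)."""
    hidden = np.tanh(coords @ params["W1"] + params["b1"])
    logits = hidden @ params["W2"] + params["b2"]
    return 1.0 / (1.0 + np.exp(-logits[:, 0]))

def element_centers(nx, ny):
    """Centers of an nx-by-ny grid of elements on the unit square."""
    xs = (np.arange(nx) + 0.5) / nx
    ys = (np.arange(ny) + 0.5) / ny
    gx, gy = np.meshgrid(xs, ys, indexing="ij")
    return np.stack([gx.ravel(), gy.ravel()], axis=1)

rng = np.random.default_rng(0)
params = init_params(rng)
coords = element_centers(8, 4)
rho = densities(params, coords)

# Baseline: one decision variable per element ("pixel").
n_decision_baseline = rho.size
# Neural TO: the trainable parameters form the decision space.
n_decision_neural = sum(p.size for p in params.values())
```

In this sketch the network is over-parameterized relative to the mesh (more parameters than elements); shrinking `n_hidden` gives the under-parameterized regime. An optimizer would update `params`, and the density field would change only indirectly through `densities`.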

Paper Structure (21 sections, 17 equations, 21 figures, 3 tables, 1 algorithm)

Figures (21)

  • Figure 1: Schematic of neural topology optimization (TO). Unlike standard density-based TO (baseline), an NN outputs the physical densities $\boldsymbol{\rho}$ (within the bounds $[0, 1]$), on which finite element analysis is performed to obtain the objective. The network parameters are updated through an optimizer to indirectly alter the density field. For the baseline, the individual "pixels" are the decision variables, while for the network, trainable parameters form the decision space. Depending on the network architecture, the output can either be the complete density field or the density at a specific location in the design domain. In the latter case, the network represents a continuous field (see Sec. \ref{sec:si_nn} of SI for details about network architectures).
  • Figure 2: Two-bar problem with stress constraints optimized using MMA, both with and without neural reparameterization: (a) Schematic showing two bars subjected to an axial load at the middle node. The objective is to minimize total mass by varying the bar areas (decision variables $A_1$ and $A_2$), with constraints on each bar's maximum stress; (b) Original decision space, with the white area showing the feasible region of the design space (note the linear feasible subspace near $(1,0)$ along $A_2 = 0$), and the colored regions denoting the constraint violations. Starting from a feasible point $\boldsymbol{\rho}_0 = \left( 1, 1\right)$, MMA converges to the local optimum $\boldsymbol{\rho}_l = \left(0, 1 \right)$. Also shown is the projected trajectory after reparameterization with a SIREN network, converging to the "singular" global minimum $\boldsymbol{\rho}_g = \left(1, 0 \right)$; (c) SIREN network used to reparameterize the problem, with a fixed input ($z_1=0.5$) and only 3 parameters ($\theta_i$). The hidden neuron has a parametric $\sin$ activation function, and outputs $A_1$ and $A_2$, where $\omega_0$ is a hyperparameter; (d) The neural decision space and the corresponding trajectory followed by the optimizer in this space. The decision space (left) consists of repeating units, and the inset (right) shows a zoomed view of the distorted constraint surfaces; (e) Three planes passing through the global optimum showing constraints, infeasible regions, and feasible paths to the solution. Details of the plots are given in Sec. \ref{stress_si} of SI.
  • Figure 3: The objective landscapes (interpolating between the same reference points) for different neural reparameterization methods compared against the baseline. The end point, at $\alpha=1$, is the decision space point corresponding to the baseline solution ($\boldsymbol{\hat{\theta}}^\star$) while the starting point is either uniform gray ($\boldsymbol{\hat{\theta}}_u$) or random values (denoted by multiple gray lines). Plots are shown for the Michell boundary value problem for SIMP penalty $p=1$ (see Sec. \ref{ssec:visualization} of SI for more results). Constraint violations are indicated by colored markers, with the size of the markers proportional to the violation at each point.
  • Figure 4: Comparison of MMA's trajectory on the neural landscape against the conventional landscape for the Michell problem (with penalization $p=3$ and target volume fraction of 60%): (a) Compliance $c$ normalized by the baseline solution $c^\star$ (left ordinate) and volume fraction $V$ (right dashed ordinate), as functions of the optimization iteration; (b) Normalized compliance interpolated between the initial and optimized solutions. The size of the dots indicates the amount of constraint violation; (c) Best feasible designs obtained during optimization for each method, all having similar compliance; (d) $L_2$-norm of the objective gradient and the angle between successive gradient vectors at each point along the optimizer's trajectory.
  • Figure 5: (a) Peak signal-to-noise ratio (PSNR) values for NNs with different numbers of parameters, measured for 4 mesh resolutions (image sizes). A higher PSNR value indicates that the NN is able to accurately represent the design obtained from the baseline (taken as ground truth). Shaded regions denote confidence intervals (one standard deviation) measured across several hyperparameters. The dashed vertical lines correspond to a network that has about $2000$ parameters, independent of mesh resolution. Cross markers indicate architectures with parameters matching each mesh resolution, distinguishing between over-parameterized and under-parameterized regimes. (b) Final designs for the Michell beam problem and their deviations from the baseline for a $320 \times 160$ resolution for all networks corresponding to cross markers.
  • ...and 16 more figures
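
Two diagnostics named in the captions above can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the paper's code: `psnr` uses the standard definition $10 \log_{10}(\mathrm{MAX}^2 / \mathrm{MSE})$ with $\mathrm{MAX} = 1$ for densities in $[0, 1]$, and `interpolate_objective` assumes a straight-line interpolation $(1-\alpha)\,\boldsymbol{\theta}_0 + \alpha\,\boldsymbol{\theta}_1$ between two reference points, which the captions do not spell out. The function names and the toy quadratic objective are hypothetical stand-ins; a real study would evaluate compliance through finite element analysis.

```python
import numpy as np

def psnr(reference, approximation, max_value=1.0):
    """Peak signal-to-noise ratio in dB; higher means a closer match."""
    diff = np.asarray(reference, dtype=float) - np.asarray(approximation, dtype=float)
    mse = np.mean(diff ** 2)
    if mse == 0.0:
        return np.inf  # identical fields
    return 10.0 * np.log10(max_value ** 2 / mse)

def interpolate_objective(objective, theta_start, theta_end, num_points=11):
    """Evaluate an objective along the straight line from theta_start to theta_end."""
    alphas = np.linspace(0.0, 1.0, num_points)
    values = np.array(
        [objective((1.0 - a) * theta_start + a * theta_end) for a in alphas]
    )
    return alphas, values

# Toy density fields: the "baseline" design and an NN approximation of it.
reference = np.array([0.0, 1.0, 1.0, 0.0])
approximation = np.array([0.1, 0.9, 1.0, 0.0])
score = psnr(reference, approximation)

# Toy convex objective standing in for the true (FEA-based) objective.
def toy_objective(theta):
    return float(np.sum((theta - 1.0) ** 2))

alphas, values = interpolate_objective(
    toy_objective, np.zeros(3), np.ones(3), num_points=5
)
```

For a convex objective such as the toy one, the interpolated curve is itself convex in $\alpha$; bumps along such a curve in a reparameterized problem are one sign of non-convexity between the two reference points.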