Multi-Objective Evolutionary Design of Molecules with Enhanced Nonlinear Optical Properties

Dominic Mashak; Jacob Schrum; S. A. Alexander

Abstract

Nonlinear optical (NLO) materials are essential for many photonic, telecommunication, and laser technologies, yet discovering better NLO molecules is computationally challenging because the chemical space is vast and the design objectives compete. We compare evolutionary algorithms for molecular design, targeting four objectives: maximizing the ratio of first-to-second hyperpolarizability $(β/γ)$, bringing the HOMO-LUMO gap and the linear polarizability into target ranges, and minimizing energy per atom. We encode molecules as SMILES strings and evaluate their properties using quantum-chemical calculations. We compare NSGA-II, MAP-Elites, MOME, a single-objective $(μ+λ)$ evolutionary algorithm, and simulated annealing. Quality diversity methods maintain archives across a measure space defined by atom and bond count, enabling the discovery of structurally diverse molecules. Our results demonstrate that NSGA-II consistently earns high scores in every objective, leading to high-quality molecules, whereas MOME explores a wider range of possibilities, resulting in higher global hypervolume and MOQD scores. Each method has its own strengths and weaknesses, and each produced many promising molecules.
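The four objectives named in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the target ranges, the `Properties` container, and the choice to negate $β/γ$ so that every entry is minimized are assumptions made only for illustration.

```python
from dataclasses import dataclass

# Hypothetical target ranges; the actual ranges are not given in this summary.
GAP_RANGE = (2.0, 4.0)        # assumed HOMO-LUMO gap window
ALPHA_RANGE = (100.0, 500.0)  # assumed linear polarizability window


def range_deviation(value, low, high):
    """Distance from value to the interval [low, high]; 0 inside the range."""
    if value < low:
        return low - value
    if value > high:
        return value - high
    return 0.0


@dataclass(frozen=True)
class Properties:
    """Hypothetical container for properties returned by a chemistry backend."""
    beta_gamma: float       # first-to-second hyperpolarizability ratio (maximize)
    gap: float              # HOMO-LUMO gap (target range)
    alpha: float            # linear polarizability (target range)
    energy_per_atom: float  # minimize


def objective_vector(p):
    """Return a tuple in which every entry is to be minimized."""
    return (
        -p.beta_gamma,  # negated so that a larger ratio scores lower
        range_deviation(p.gap, *GAP_RANGE),
        range_deviation(p.alpha, *ALPHA_RANGE),
        p.energy_per_atom,
    )


def dominates(a, b):
    """True if a is no worse than b in every objective and better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(vectors):
    """Keep only the vectors that no other vector dominates."""
    return [v for v in vectors if not any(dominates(w, v) for w in vectors)]
```

A range-deviation objective of 0 means the property already lies inside its target window, which matches the "perfect minimal score of 0" language used in the figure captions.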

Paper Structure (24 sections, 2 equations, 6 figures)


Introduction
Related Work
Defining an Effective Electro-Optic Modulator
Methods
SMILES String Encoding
Calculating Chemical Properties
Multiobjective Optimization
Quality Diversity
Multiobjective Quality Diversity
Experiment
Algorithms
Objectives
Diversity Measures and Archives
SMILES String Mutations
Algorithmic Settings
...and 9 more sections

Figures (6)

Figure 1: Median Best Objective Scores Across 20 Runs of Each Algorithm: (a) Median first-to-second hyperpolarizability ratio. Higher values are better, so $(\mu+\lambda)$ outperforms all others by a large margin, including other single-objective methods, though NSGA-II is clearly second-best. (b) Median linear polarizability range deviation. All but simulated annealing and $(\mu+\lambda)$ quickly reach a perfect minimal score of 0, though $(\mu+\lambda)$ at least gets close. (c) Median $f_{\Delta E}$ (HOMO-LUMO gap) range deviation. NSGA-II and simulated annealing tie for best with perfect minimal scores of 0, which the other algorithms do not reach. The single-objective methods were not aware of this objective, so their poorer performance is not surprising, but MOME's performance is slightly disappointing. However, these are only median scores; some MOME runs reach the perfect score, but fewer than half do. (d) Median energy per atom. NSGA-II is clearly the best at minimizing this objective, with most other methods clustering closer together, including the single-objective methods that were unaware of this objective. Only simulated annealing is exceptionally poor.
Figure 2: Global Hypervolume Scores Across 20 Runs of Each Algorithm: (a) Median hypervolume scores for each algorithm across function evaluations. $\text{MOME}_{F}$ is the best, followed by a cluster of $(\mu+\lambda)$, NSGA-II, and $\text{MAP-Elites}_{C}$, before algorithms start to bunch together near the bottom. (b) Box-and-whisker plots of hypervolume scores for final Pareto fronts. The lower quartile, median, and upper quartile are the lower boundary, center line, and upper boundary of each box respectively, and the whiskers denote the furthest points within $1.5 \times IQR$ of the nearest quartile, where $IQR$ is the interquartile range. Points outside of the whiskers are outliers, of which there are many. However, $\text{MOME}_{F}$ and $(\mu+\lambda)$ both have high upper quartiles, and spread more across the range of higher scores.
Figure 3: Fine-grained and Coarse Archive Median Scores Across 20 Runs of Each Algorithm: (a) Median bin count with fine-grained binning. Unsurprisingly, QD methods fill more bins than non-QD approaches, and simulated annealing performs the worst. Fine-grained QD methods occupy slightly more bins than their coarse counterparts. (b) Median bin count with coarse binning. The scale is different with a coarse archive, but results are qualitatively identical to the fine-grained archive results. Interestingly, fine-grained QD methods still occupy more bins than their coarse counterparts. (c) Median MOQD with fine-grained binning. $\text{MOME}_{F}$ is clearly superior, followed distantly by $\text{MAP-Elites}_{C}$, then NSGA-II, before the rest cluster more tightly together. (d) Median MOQD with coarse binning. Qualitatively similar to the fine-grained MOQD results, except that $(\mu+\lambda)$ demonstrates a significant jump in MOQD score near the end of evolution that ties it with NSGA-II.
Figure 4: Fine-grained Archive Median QD Scores by Objective Across 20 Runs of Each Algorithm: (a) Median QD for first-to-second hyperpolarizability ratio using fine-grained binning. Qualitatively similar to raw objective scores for $\beta/\gamma$ (Figure 1), with strong performance by $(\mu+\lambda)$ and NSGA-II in second place. (b) Median QD for linear polarizability range deviation using fine-grained binning. Both MOME approaches perform the best, with MAP-Elites approaches beneath them, and NSGA-II trailing close behind, beating $(\mu+\lambda)$ and simulated annealing. (c) Median QD for HOMO-LUMO gap range deviation using fine-grained binning. Qualitatively similar to the $(f_{\alpha})_{F}$ results. (d) Median QD for energy per atom using fine-grained binning. The MOQD and QD methods are more tightly clustered with NSGA-II near the top, but $(\mu+\lambda)$ is still far behind and simulated annealing is far below that.
Figure 5: Fine-grained Mega Archive Hypervolume Heatmaps: Fine-grained archives that combine solutions from each algorithm across all 20 seeds, with heat scale showing each bin's HV score. The x-axis is the atom count and the y-axis is the bond count.
...and 1 more figure
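The bin counts reported in Figure 3 depend on how the (atom count, bond count) measure space is discretized. The following is a minimal sketch of that idea, not the authors' code: the bin widths (1 for fine, 5 for coarse), the single scalar fitness per bin (as in plain MAP-Elites), and the toy molecules with placeholder fitness values are all assumptions for illustration.

```python
def bin_index(atom_count, bond_count, width):
    """Map a molecule's measures to a 2D archive cell of the given width."""
    return (atom_count // width, bond_count // width)


def insert(archive, solution, width):
    """Place a solution in its bin if the bin is empty or the solution is fitter.

    solution is (smiles, atom_count, bond_count, fitness); higher fitness is better.
    Returns True when the archive changed.
    """
    _, atoms, bonds, fitness = solution
    key = bin_index(atoms, bonds, width)
    incumbent = archive.get(key)
    if incumbent is None or fitness > incumbent[3]:
        archive[key] = solution
        return True
    return False


def build_archive(solutions, width):
    """Insert every solution in order and return the resulting archive."""
    archive = {}
    for solution in solutions:
        insert(archive, solution, width)
    return archive


# Toy data: heavy-atom and bond counts are correct for these SMILES strings,
# but the fitness values are invented placeholders, not computed properties.
solutions = [
    ("CC", 2, 1, 0.1),
    ("C=C", 2, 1, 0.4),
    ("CCO", 3, 2, 0.3),
    ("c1ccccc1", 6, 6, 0.9),
]

fine = build_archive(solutions, width=1)    # assumed fine-grained resolution
coarse = build_archive(solutions, width=5)  # assumed coarse resolution

print(len(fine), len(coarse))  # number of occupied bins in each archive
```

With wider bins, more molecules compete for the same cell, so the same set of solutions occupies fewer bins in the coarse archive. A multi-objective archive such as MOME's would store a Pareto front in each cell instead of a single elite.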
