
Predicting the Temperature-Dependent CMC of Surfactant Mixtures with Graph Neural Networks

Christoforos Brozos, Jan G. Rittig, Elie Akanny, Sandip Bhattacharya, Christina Kohlmann, Alexander Mitsos

Abstract

Surfactants are key ingredients in foaming and cleansing products across industries such as personal and home care and industrial cleaning, and the critical micelle concentration (CMC) is one of their most important properties. Predictive models for the CMC of pure surfactants have been developed with recent ML methods; in practice, however, surfactant mixtures are typically used for performance, environmental, and cost reasons. Modeling mixtures requires accounting for synergistic and antagonistic interactions between surfactants, yet predictive ML models covering a wide spectrum of mixtures are still missing. Herein, we develop a graph neural network (GNN) framework for surfactant mixtures to predict the temperature-dependent CMC. We collect data for 108 binary surfactant mixtures, to which we add data for pure species from our previous work [Brozos et al. (2024), J. Chem. Theory Comput.]. We then develop and train GNNs and evaluate their accuracy across different prediction test scenarios for binary mixtures relevant to practical applications. The final GNN models demonstrate very high predictive performance when interpolating between different mixture compositions and for new binary mixtures with known species. Extrapolation to binary surfactant mixtures in which one or both surfactant species were not seen during training yields accurate results for the majority of surfactant systems. We further find that the GNN is more accurate than a semi-empirical model based on activity coefficients, which has been widely used to date. We then explore whether GNN models trained solely on binary mixture and pure species data can also accurately predict the CMCs of ternary mixtures. Finally, we experimentally measure the CMC of four commercial surfactants that contain up to four species, as well as of industrially relevant mixtures, and find very good agreement between measured and predicted CMC values.
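To make the notion of an activity-coefficient-based mixture model concrete, the following is a minimal sketch of one commonly used form, the pseudo-phase mixing rule $1/\mathrm{CMC}_{\mathrm{mix}} = \sum_i \alpha_i / (f_i\,\mathrm{CMC}_i)$, where $\alpha_i$ is the mole fraction of surfactant $i$ in the total surfactant and $f_i$ is its activity coefficient in the mixed micelle. The abstract does not state which semi-empirical model the authors use, so this form, the function names, and all numerical values below are illustrative assumptions rather than the paper's implementation. Setting every $f_i = 1$ recovers ideal mixing; $f_i < 1$ corresponds to synergistic behavior.

```python
import math


def mixture_cmc(fractions, pure_cmcs, activity_coeffs=None):
    """Mixture CMC from pure-species CMCs (hypothetical helper).

    Implements 1/CMC_mix = sum_i alpha_i / (f_i * CMC_i).
    fractions       -- mole fractions alpha_i of each surfactant (must sum to 1)
    pure_cmcs       -- CMC of each pure surfactant, all in the same unit
    activity_coeffs -- f_i in the mixed micelle; defaults to 1 (ideal mixing)
    """
    if activity_coeffs is None:
        activity_coeffs = [1.0] * len(fractions)
    if not (len(fractions) == len(pure_cmcs) == len(activity_coeffs)):
        raise ValueError("all inputs must have the same length")
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("mole fractions must sum to 1")
    inverse = sum(
        a / (f * c) for a, f, c in zip(fractions, activity_coeffs, pure_cmcs)
    )
    return 1.0 / inverse


def log10_cmc_micromolar(cmc_molar):
    """Convert a CMC in mol/L to log10(CMC in micromolar)."""
    return math.log10(cmc_molar * 1e6)


# Illustrative numbers only, not data from the paper.
ideal = mixture_cmc([0.5, 0.5], [2e-3, 8e-3])
synergistic = mixture_cmc([0.5, 0.5], [2e-3, 8e-3], activity_coeffs=[0.5, 0.5])
print(ideal, synergistic, log10_cmc_micromolar(ideal))
```

In a full treatment the activity coefficients are not free inputs but depend on the micellar composition and an interaction parameter fitted to data, which is what makes such models semi-empirical.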

Paper Structure

This paper contains 20 sections, 5 equations, 9 figures, 4 tables.

Figures (9)

  • Figure 1: Mixture network of the curated data set. Each node represents a surfactant structure, and each edge indicates that data exist for a binary mixture of the two connected surfactants. The surfactants are also categorized by class. Each node is placed on a 2D map obtained by applying t-SNE (van der Maaten and Hinton, 2008) to generated molecular fingerprints (ECFP_10; Rogers and Hahn, 2010). The surfactants separate well into clusters according to their classes. The exception is the three n-alkyl-n-methylglucamide surfactants (Mega-8, -9, and -10) enclosed in the blue circle.
  • Figure 2: Schematic representation of the WS-GNN architecture for a binary mixture.
  • Figure 3: Schematic representation of the MG-GNN architecture for binary mixtures.
  • Figure 4: Surface tension measurement of D-AB30 at 23 $^\circ$C.
  • Figure 5: Parity plots for the 4 test sets. The predictions are made by the combined GNN model. The data points are highlighted with different colors and markers based on the classes of the two mixture species. The CMC is shown as the base-10 logarithm of its value in $\mu$M.
  • ...and 4 more figures