Thermodynamically consistent machine learning model for excess Gibbs energy

Marco Hoffmann; Thomas Specht; Quirin Göttl; Jakob Burger; Stephan Mandt; Hans Hasse; Fabian Jirasek

TL;DR

HANNA predicts the excess Gibbs energy $g^\mathrm{E}$ of multi-component liquid mixtures from molecular structure while enforcing thermodynamic constraints. It combines transformer-based molecular embeddings with a neural network that builds in physical laws as hard constraints, and it uses a geometric Muggianu projection to extend binary subsystem predictions to mixtures with an arbitrary number of components. A differentiable surrogate solver enables end-to-end training on liquid-liquid equilibrium (LLE) data. The model is trained on extensive binary data sets from the Dortmund Data Bank: vapor-liquid equilibria (VLE; TPXY and TPX data), LLE, activity coefficients at infinite dilution (ACI), and excess enthalpies (HE). It is reported to be more accurate than state-of-the-art UNIFAC-based models for binary and ternary systems, including systems with ionic liquids, while covering a substantially broader domain of applicability. The trained model and code are openly available, and an interactive interface on the MLPROP website is intended to ease integration into process design and materials discovery.
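To make the idea of a geometric Muggianu projection concrete, the following is a minimal sketch of the textbook Muggianu formula, which estimates the multi-component $g^\mathrm{E}/(RT)$ as a weighted sum of binary contributions, each evaluated at a projected binary composition. It is not taken from the HANNA code: the function names, the callback signature `binary_ge(i, j, xi)`, and the one-parameter Porter model with its coefficients are assumptions chosen for illustration, and HANNA's own implementation may differ in detail.

```python
from itertools import combinations


def muggianu_ge(x, binary_ge):
    """Estimate multi-component g^E/(RT) from binary models (Muggianu).

    x         : mole fractions of all components (should sum to 1).
    binary_ge : hypothetical callback binary_ge(i, j, xi) returning g^E/(RT)
                of the binary i-j at mole fraction xi of component i.
    """
    total = 0.0
    for i, j in combinations(range(len(x)), 2):
        if x[i] == 0.0 or x[j] == 0.0:
            continue  # this pair contributes nothing; also avoids 0/0
        # Project the mixture composition onto the binary i-j edge.
        Xi = 0.5 * (1.0 + x[i] - x[j])
        Xj = 0.5 * (1.0 + x[j] - x[i])
        weight = x[i] * x[j] / (Xi * Xj)
        total += weight * binary_ge(i, j, Xi)
    return total


# Illustrative binary interaction parameters (made up, not fitted values).
A = {(0, 1): 1.2, (0, 2): 0.5, (1, 2): -0.3}


def porter(i, j, xi):
    """One-parameter Porter model, g^E/(RT) = A_ij * x_i * x_j (assumption)."""
    return A[(i, j)] * xi * (1.0 - xi)


ge_ternary = muggianu_ge([0.2, 0.3, 0.5], porter)
ge_binary_limit = muggianu_ge([0.4, 0.6, 0.0], porter)
```

Two properties are visible in this sketch: when one mole fraction is zero, the result reduces exactly to the corresponding binary model, and only binary information is needed regardless of the number of components.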

Abstract

The excess Gibbs energy plays a central role in chemical engineering and chemistry, providing a basis for modeling thermodynamic properties of liquid mixtures. Predicting the excess Gibbs energy of multi-component mixtures solely from molecular structures is a long-standing challenge. We address this challenge with HANNA, a flexible machine learning model for excess Gibbs energy that integrates physical laws as hard constraints, guaranteeing thermodynamically consistent predictions. HANNA is trained on experimental data for vapor-liquid equilibria, liquid-liquid equilibria, activity coefficients at infinite dilution and excess enthalpies in binary mixtures. The end-to-end training on liquid-liquid equilibrium data is facilitated by a surrogate solver. A geometric projection method enables robust extrapolations to multi-component mixtures. We demonstrate that HANNA delivers accurate predictions, while providing a substantially broader domain of applicability than state-of-the-art benchmark methods. The trained model and corresponding code are openly available, and an interactive interface is provided on our website, MLPROP.
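One way to see why modeling $g^\mathrm{E}$ directly supports thermodynamic consistency: if both activity coefficients of a binary mixture are derived from a single differentiable function $g(x_1) = g^\mathrm{E}/(RT)$ via $\ln\gamma_1 = g + x_2\,\mathrm{d}g/\mathrm{d}x_1$ and $\ln\gamma_2 = g - x_1\,\mathrm{d}g/\mathrm{d}x_1$, the isothermal, isobaric Gibbs-Duhem equation $x_1\,\mathrm{d}\ln\gamma_1 + x_2\,\mathrm{d}\ln\gamma_2 = 0$ holds automatically. The sketch below checks this numerically. It illustrates the general principle only, not HANNA's architecture: the two-parameter Margules model, its coefficients, and the finite-difference step are assumptions for illustration, standing in for a learned model and automatic differentiation.

```python
def g_margules(x1, a12=1.0, a21=2.0):
    """Two-parameter Margules g^E/(RT); a12 and a21 are illustrative values."""
    x2 = 1.0 - x1
    return x1 * x2 * (a21 * x1 + a12 * x2)


def ln_gammas(g, x1, h=1e-4):
    """Return (ln gamma_1, ln gamma_2) derived from g(x1) = g^E/(RT).

    The derivative dg/dx1 is approximated by a central difference; a
    learned model would typically use automatic differentiation instead.
    """
    dg = (g(x1 + h) - g(x1 - h)) / (2.0 * h)
    x2 = 1.0 - x1
    return g(x1) + x2 * dg, g(x1) - x1 * dg


def gibbs_duhem_residual(g, x1, h=1e-4):
    """x1 * dln(gamma_1)/dx1 + x2 * dln(gamma_2)/dx1; zero if consistent."""
    lo = ln_gammas(g, x1 - h)
    hi = ln_gammas(g, x1 + h)
    d1 = (hi[0] - lo[0]) / (2.0 * h)
    d2 = (hi[1] - lo[1]) / (2.0 * h)
    return x1 * d1 + (1.0 - x1) * d2


residuals = [gibbs_duhem_residual(g_margules, x1) for x1 in (0.1, 0.5, 0.9)]
ln_g1_inf, _ = ln_gammas(g_margules, 0.0)  # infinite dilution of component 1
```

The residuals vanish up to numerical error for any smooth choice of `g`, which is the sense in which a constraint of this kind is "hard": it follows from the model structure rather than from a penalty term in the loss.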
