A neural recommender system leveraging transfer learning for property prediction of ionic liquids
Sahil Sethi, Kai Sundmacher, Caroline Ganzer
TL;DR
This work addresses the prediction of key thermophysical properties of ionic liquids (ILs) from sparse experimental data using a two-stage transfer-learning pipeline. A neural recommender system is first pre-trained on COSMO-RS simulated data to learn property-specific structural embeddings for cations and anions. It is then fine-tuned on limited experimental data to capture temperature and pressure effects and to enable cross-property transfer across five properties: density, viscosity, surface tension, heat capacity, and melting point. Transfer learning substantially improves performance for four of the five properties, as measured by $R^2$, MAE, and MAPE, and the trained models scale to predictions for more than 700,000 ILs while extrapolating robustly to previously unseen ILs. The framework reduces experimental data requirements and provides a practical tool for IL screening in process design. Melting point prediction benefits less from transfer learning, which leaves room for future improvement and for extension to additional properties.
Abstract
Ionic liquids (ILs) have emerged as versatile replacements for traditional solvents because their physicochemical properties can be precisely tailored to various applications. However, accurately predicting key thermophysical properties remains challenging due to the vast chemical design space and the limited availability of experimental data. In this study, we present a data-driven transfer learning framework combined with a neural recommender system (NRS) to enable reliable property prediction for ILs using sparse experimental datasets. The approach involves a two-stage process: first, pre-training NRS models on COSMO-RS-based simulated data at fixed temperature and pressure, and second, fine-tuning simple feedforward neural networks with experimental data at varying temperatures and pressures. In this work, five essential IL properties are considered: density, viscosity, surface tension, heat capacity, and melting point. We find that the framework supports both within-property and cross-property knowledge transfer. Notably, pre-trained models for density, viscosity, and heat capacity are used to fine-tune models for all five target properties, achieving improved performance by a substantial margin for four of them. The model exhibits robust extrapolation to previously unseen ILs. Moreover, the final trained models enable property prediction for over 700,000 IL combinations, offering a scalable solution for IL screening in process design. This work highlights the effectiveness of combining simulated data and transfer learning to overcome sparsity in the experimental data.
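The two-stage process described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a plain dot-product recommender in place of the neural recommender system, a linear head in place of the feedforward network, and toy placeholder numbers in place of COSMO-RS and experimental data. All function names, the embedding size, the learning rates, and the temperature and pressure scaling are hypothetical choices made for illustration.

```python
import random

random.seed(0)
DIM = 4  # embedding size; an illustrative choice, not taken from the paper

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def init_embeddings(n):
    return [[random.gauss(0.0, 0.1) for _ in range(DIM)] for _ in range(n)]

def pretrain(sim_data, n_cations, n_anions, epochs=200, lr=0.05):
    """Stage 1: fit cation/anion embeddings to simulated (cation, anion, value) triples."""
    C, A = init_embeddings(n_cations), init_embeddings(n_anions)
    history = []
    for _ in range(epochs):
        total = 0.0
        for c, a, y in sim_data:
            err = dot(C[c], A[a]) - y
            total += err * err
            for k in range(DIM):
                ck, ak = C[c][k], A[a][k]
                C[c][k] -= lr * err * ak
                A[a][k] -= lr * err * ck
        history.append(total / len(sim_data))
    return C, A, history

def features(C, A, c, a, T, P):
    """Frozen embeddings plus scaled temperature (K), pressure (kPa), and a bias term."""
    return C[c] + A[a] + [(T - 298.15) / 100.0, (P - 101.325) / 10000.0, 1.0]

def fine_tune(exp_data, C, A, epochs=300, lr=0.02):
    """Stage 2: train only the head weights; C and A are never updated."""
    w = [0.0] * (2 * DIM + 3)
    history = []
    for _ in range(epochs):
        total = 0.0
        for c, a, T, P, y in exp_data:
            x = features(C, A, c, a, T, P)
            err = dot(w, x) - y
            total += err * err
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
        history.append(total / len(exp_data))
    return w, history

def predict(w, C, A, c, a, T, P):
    return dot(w, features(C, A, c, a, T, P))

# Toy placeholder numbers (not real COSMO-RS or experimental values).
sim_data = [(c, a, 1.0 + 0.1 * c - 0.05 * a) for c in range(3) for a in range(3)]
exp_data = [(0, 0, 298.15, 101.325, 1.20), (0, 1, 313.15, 101.325, 1.15),
            (1, 2, 333.15, 1000.0, 1.10), (2, 0, 353.15, 5000.0, 1.05)]

C, A, pre_hist = pretrain(sim_data, n_cations=3, n_anions=3)
w, ft_hist = fine_tune(exp_data, C, A)
# Cation 2 with anion 2 never appears in exp_data, yet can still be scored.
estimate = predict(w, C, A, c=2, a=2, T=298.15, P=101.325)
```

In this sketch the simulated data covers every cation-anion pair at fixed conditions, while the experimental data covers only a few pairs at varying temperature and pressure. Because the embeddings are learned in stage 1 and kept frozen in stage 2, the head can score combinations that have no experimental measurements, which is the mechanism that would let such a model cover a large combinatorial space of ILs.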
