Unveiling the Potential of AI for Nanomaterial Morphology Prediction
Ivan Dubrovsky, Andrei Dmitrenko, Aleksei Dmitrenko, Nikita Serov, Vladimir Vinogradov
TL;DR
The paper addresses the challenge of predicting nanomaterial morphology under realistic data constraints by building a large multimodal calcium carbonate dataset and evaluating classical ML, large language models, and a text-to-image pipeline. It demonstrates that tree-based ensembles achieve strong shape accuracy (≈0.80) and that GPT-4 and related LLMs can match or exceed these results in few-shot settings, offering a language-driven route to morphology prediction. A three-component text-to-image prototype further explores mapping synthesis descriptions to nanoparticle images; it shows potential for in silico morphology visualization but is limited by the available data. The work also exposes a substantial need for larger, standardized datasets and improved handling of SEM data. Overall, the study provides a rigorous benchmark and showcases promising AI approaches for nanomaterial design, potentially reducing experimental overhead and accelerating discovery. The findings emphasize that combining statistical feature analyses, classical ML, and LLM-driven prompts can yield robust morphology predictions with practical implications for materials science workflows.
Abstract
Creating nanomaterials with a specific morphology remains a complex experimental process, even though demand for these materials is growing across industry sectors. This study explores the potential of AI to predict the morphology of nanoparticles under data availability constraints. To that end, we first generated a new multimodal dataset twice the size of those used in analogous studies. Then, we systematically evaluated the performance of classical machine learning and large language models in predicting nanomaterial shapes and sizes. Finally, we prototyped a text-to-image system and discussed the empirical results obtained, as well as the limitations and promise of existing approaches.
