
Generative Chemical Language Models for Energetic Materials Discovery

Andrew Salij, R. Seaton Ullberg, Megan C. Davis, Marc J. Cawkwell, Christopher J. Snyder, Cristina Garcia Cardona, Ivana Matanovic, Wilton J. M. Kort-Kamp

Abstract

The discovery of new energetic materials remains a pressing challenge, hindered by the limited availability of high-quality data. To address this, we have developed generative molecular language models that are pretrained on extensive chemical data and then fine-tuned on curated energetic materials datasets. This transfer-learning strategy extends chemical language model capabilities beyond the pharmacological space in which they have predominantly been developed, offering a framework applicable to other data-sparse discovery problems. Furthermore, we discuss the benefits of fragment-based molecular encodings for chemical language models, particularly for constructing synthetically accessible structures. Together, these advances provide a foundation for accelerating the design of next-generation energetic materials with demanding performance requirements.
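The pretrain-then-fine-tune recipe summarized above can be illustrated with off-the-shelf tooling. The following is a minimal sketch, not the authors' pipeline: it assumes a generic GPT-2 checkpoint from Hugging Face `transformers` as a stand-in for a chemistry-pretrained model, and a hypothetical file `energetics.selfies` containing one (Group) SELFIES-encoded molecule per line (e.g., produced with the `selfies` package's `encoder`). Model names, file names, and hyperparameters here are assumptions for illustration only.

```python
# Minimal transfer-learning sketch (illustrative; not the paper's actual code).
# Fine-tunes a pretrained causal language model on a small energetics corpus.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

BASE_MODEL = "gpt2"  # stand-in for a chemistry-pretrained GPT checkpoint
tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

# Hypothetical fine-tuning corpus: one SELFIES string per line.
dataset = load_dataset("text", data_files={"train": "energetics.selfies"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=128)

tokenized = dataset.map(tokenize, batched=True, remove_columns=["text"])
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetune-out",
                           num_train_epochs=10,
                           per_device_train_batch_size=32,
                           learning_rate=5e-5),
    train_dataset=tokenized["train"],
    data_collator=collator,
)
trainer.train()

# Sample an unconditioned molecule from the fine-tuned model.
ids = model.generate(do_sample=True, top_k=50, max_length=64,
                     pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(ids[0], skip_special_tokens=True))
```

In practice, a tokenizer whose vocabulary matches the SELFIES (or Group SELFIES) token set would replace the generic byte-pair tokenizer used in this sketch.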

Paper Structure

This paper contains 25 sections, 3 equations, 18 figures, 5 tables.

Figures (18)

  • Figure 1: a) Training pipeline for GPT models, staged into pretraining, which yields a wide variety of molecules, and fine-tuning, which yields many C-, N-, and O-containing compounds. b) Scheme of the GPT model architecture and data processing.
  • Figure 2: Synthetic accessibility (SA) score [ertl2009estimation] distributions for unconditioned molecular outputs of pretrained $\chi$hem- and fine-tuned X-GPT models against a) the number of heavy atoms generated and b) detonation velocities predicted via a ChemProp [heid2023chemprop] surrogate. All subfigures are normalized such that the highest histogram bin is 1.
  • Figure 3: Comparison of detonation velocities and pressures estimated from the Kamlet-Jacobs equations [kamlet1968chemistry] for unconditioned (left column) and conditioned (right column) generation by fine-tuned GroupSELFIES models, compared to the base model. All subfigures have been normalized to an identical maximum value.
  • Figure 4: Distributions of a) the number of nitrogen-oxygen bonds, b) the number of nitrogen-nitrogen bonds, c) the quantitative estimate of drug-likeness (QED), and d) the synthetic accessibility score (SA score) for molecules generated by large SELFIES-based GPT models.
  • Figure 5: Common substructures of generated output from chemical language models, manually selected for diversity. Substructures were chosen as representative samples from the 200 most common subgraphs of size $5-10$, obtained via RDKit [rdkitsoftware].
  • ...and 13 more figures
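For the screening quantities referenced in the Figure 2 and Figure 3 captions above, the sketch below shows one way such quantities are commonly computed: the Ertl-Schuffenhauer synthetic accessibility score via RDKit's contributed SA_Score module [ertl2009estimation], and detonation velocity and pressure via a common form of the Kamlet-Jacobs equations [kamlet1968chemistry], $D = 1.01\,\Phi^{1/2}(1 + 1.30\,\rho_0)$ and $P = 1.558\,\rho_0^2\,\Phi$ with $\Phi = N \bar{M}^{1/2} Q^{1/2}$. This is a minimal sketch, not the paper's surrogate models, and the numeric inputs are illustrative placeholders rather than values from the paper.

```python
# Illustrative screening helpers (not the paper's surrogate models).
import math
import os
import sys

from rdkit import Chem
from rdkit.Chem import RDConfig

# RDKit ships the Ertl-Schuffenhauer SA score as a contrib module.
sys.path.append(os.path.join(RDConfig.RDContribDir, "SA_Score"))
import sascorer

def sa_score(smiles: str) -> float:
    """Synthetic accessibility score: roughly 1 (easy) to 10 (hard)."""
    return sascorer.calculateScore(Chem.MolFromSmiles(smiles))

def kamlet_jacobs(N: float, M: float, Q: float, rho: float):
    """Kamlet-Jacobs estimates of detonation performance.

    N:   mol of gaseous detonation products per gram of explosive
    M:   mean molecular weight of those gases (g/mol)
    Q:   heat of detonation (cal/g)
    rho: loading density (g/cm^3)
    Returns (D in km/s, P in GPa).
    """
    phi = N * math.sqrt(M) * math.sqrt(Q)
    D = 1.01 * math.sqrt(phi) * (1.0 + 1.30 * rho)
    P = 1.558 * rho**2 * phi
    return D, P

# TNT as a familiar test case; N, M, Q, rho are rough illustrative values.
tnt = "Cc1c(cc(cc1[N+](=O)[O-])[N+](=O)[O-])[N+](=O)[O-]"
print(f"SA score: {sa_score(tnt):.2f}")
D, P = kamlet_jacobs(N=0.025, M=28.0, Q=1090.0, rho=1.64)
print(f"D = {D:.2f} km/s, P = {P:.1f} GPa")
```

A learned surrogate such as the ChemProp model mentioned in the Figure 2 caption would replace the closed-form Kamlet-Jacobs estimate when trained property data are available.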