Olfactory Label Prediction on Aroma-Chemical Pairs
Laura Sisson, Aryan Amit Barsainyan, Mrityunjay Sharma, Ritesh Kumar
TL;DR
This work addresses the challenge of predicting olfactory descriptors for blends of aroma-chemicals using graph neural networks. It introduces a labeled dataset of blended pairs, applies a graph-carving strategy to ensure robust train/test separation, and compares two architectures: a Graph Isomorphism Network (GIN-GNN) and a Message Passing Neural Network (MPNN-GNN). The MPNN-GNN achieves a mean AUROC of about $0.77$ on blended pairs and $0.89$ on single-molecule prediction, with the GIN-GNN close behind on blends, indicating strong transferability between blend and single-molecule tasks. Embedding-space analyses reveal non-linear blending and selective contributions from constituent molecules, and the authors provide public code to encourage further exploration and data augmentation in this domain.
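To make the headline metric concrete, here is a minimal sketch of a mean AUROC over several odor labels. It assumes the mean is an unweighted (macro) average of per-label AUROCs, which this summary does not confirm. The label names, scores, and the helper names `auroc` and `mean_auroc` are hypothetical illustrations, not the authors' code or data.

```python
from typing import Dict, Sequence


def auroc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUROC for one binary label.

    Computed as the fraction of (positive, negative) example pairs in which
    the positive receives the higher score, counting ties as one half.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUROC is undefined unless both classes are present")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_auroc(
    y_true: Dict[str, Sequence[int]],
    y_score: Dict[str, Sequence[float]],
) -> float:
    """Unweighted mean of per-label AUROCs (an assumed averaging scheme)."""
    per_label = [auroc(y_true[name], y_score[name]) for name in y_true]
    return sum(per_label) / len(per_label)


# Toy data: four hypothetical blends scored against two hypothetical labels.
toy_true = {"fruity": [1, 0, 1, 0], "woody": [0, 0, 1, 1]}
toy_score = {"fruity": [0.9, 0.2, 0.4, 0.6], "woody": [0.1, 0.3, 0.8, 0.7]}

print(mean_auroc(toy_true, toy_score))  # 0.875
```

On the toy data, the "fruity" label scores 0.75 (three of four positive/negative pairs are ranked correctly) and "woody" scores 1.0, so the mean is 0.875. A value of 0.5 corresponds to random ranking.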
Abstract
The application of deep learning techniques to aroma-chemicals has resulted in models more accurate than human experts at predicting olfactory qualities. However, public research in this domain has been limited to predicting the qualities of single molecules, whereas in industry applications, perfumers and food scientists are often concerned with blends of many molecules. In this paper, we apply both existing and novel approaches to a dataset we gathered consisting of labeled pairs of molecules. We present graph neural network models capable of accurately predicting the odor qualities arising from blends of aroma-chemicals, with an analysis of how variations in architecture can lead to significant differences in predictive power.
