A simple DNN regression for the chemical composition in essential oil
Yuki Harada, Shuichi Maeda, Masato Kiyama, Shinichiro Nakamura
TL;DR
The paper addresses the prediction of essential oil properties from chemical composition, a problem that has received far less attention than single-molecule activity prediction. It evaluates three architectures (CNN, GCNconv, GATconv) as DNN regressors, using concatenated area percentages and molecular fingerprints as inputs, and tests two loss designs on a complete-graph representation with fingerprint-based edges. Although the small dataset leads to overfitting, certain configurations (GCNconv with BCEWithLogitsLoss and GATconv with NLL_loss) show promising predictive performance as measured by cross-validation AUC. The work indicates that deep learning models can feasibly infer essential oil properties from composition data, and it outlines future directions including the integration of sensory data and richer representations of plant taxonomy and compound information.
Abstract
Although experimental designs and methodological surveys for mono-molecular activity and property prediction have been extensively investigated, those for chemical composition have received little attention, with the exception of a few prior studies. In this study, we configured three simple DNN regressors to predict essential oil properties from chemical composition. Despite overfitting caused by the small size of the dataset, all models were trained effectively.
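As an illustration of the input described in the TL;DR, the following minimal sketch builds node features by concatenating each component's area percentage with its fingerprint bits, and connects all components in a complete graph whose edge weights are derived from the fingerprints. It is not the authors' implementation: the Tanimoto similarity as the edge weight, the scaling of area percentages to fractions, the 4-bit toy fingerprints, and the names `tanimoto` and `build_oil_graph` are all assumptions made for this example.

```python
from itertools import permutations


def tanimoto(a, b):
    """Tanimoto similarity between two binary fingerprints (lists of 0/1)."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def build_oil_graph(components):
    """Turn one oil into (node features, edges, edge weights).

    components: list of (area_percent, fingerprint_bits), one per compound.
    """
    # Node feature = [area fraction] concatenated with the fingerprint bits.
    nodes = [[area / 100.0] + list(fp) for area, fp in components]
    # Complete graph: one directed edge for every ordered pair of distinct nodes.
    edges = list(permutations(range(len(components)), 2))
    # Assumed edge weight: fingerprint similarity of the two endpoint compounds.
    weights = [tanimoto(components[i][1], components[j][1]) for i, j in edges]
    return nodes, edges, weights


# Hypothetical toy oil with three components (values are made up).
oil = [
    (60.0, [1, 1, 0, 0]),
    (30.0, [1, 0, 1, 0]),
    (10.0, [0, 0, 1, 1]),
]
nodes, edges, weights = build_oil_graph(oil)
```

The node list and edge list produced here could then be converted to tensors and passed to a graph convolution layer; for a CNN-style regressor, the node rows could instead be stacked into a single fixed-size input.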
