A simple DNN regression for the chemical composition in essential oil

Yuki Harada, Shuichi Maeda, Masato Kiyama, Shinichiro Nakamura

TL;DR

The paper addresses the prediction of essential oil properties from chemical composition, a problem that has received far less attention than single-molecule activity prediction. It evaluates three architectures (CNN, GCNconv, GATconv) as DNN regressors that take concatenated area percentages and molecular fingerprints as inputs, and tests two loss designs on a complete-graph representation with fingerprint-based edges. Although the small dataset leads to overfitting, certain configurations (GCNconv with BCEWithLogitsLoss and GATconv with NLL_loss) show promising predictive performance as measured by cross-validation AUC. The work indicates that deep learning models can feasibly infer essential oil properties from composition data, and it outlines future directions including the integration of sensory data and richer representations of plant taxonomy and compound information.
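The input representation described above can be illustrated with a minimal sketch in plain Python. It is not the authors' code: the function names `tanimoto` and `build_oil_graph`, the rescaling of area percentages to fractions, the toy 4-bit fingerprints, and the choice of Tanimoto similarity as the "fingerprint-based" edge weight are all assumptions made for illustration.

```python
from itertools import combinations


def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two equal-length binary fingerprints.

    Assumption: the summary only says edges are "fingerprint-based";
    Tanimoto similarity is one common choice, used here for illustration.
    """
    both = sum(1 for a, b in zip(fp_a, fp_b) if a and b)
    either = sum(1 for a, b in zip(fp_a, fp_b) if a or b)
    return both / either if either else 0.0


def build_oil_graph(area_percent, fingerprints):
    """Build a complete graph for one essential oil sample.

    Each node is one compound; its feature vector concatenates the
    compound's area fraction with its fingerprint bits. Every unordered
    pair of compounds is connected by a similarity-weighted edge.
    """
    if len(area_percent) != len(fingerprints):
        raise ValueError("need one fingerprint per compound")
    # Assumption: area percentages are rescaled to the range [0, 1].
    nodes = [[pct / 100.0] + list(fp)
             for pct, fp in zip(area_percent, fingerprints)]
    edges = {(i, j): tanimoto(fingerprints[i], fingerprints[j])
             for i, j in combinations(range(len(nodes)), 2)}
    return nodes, edges


# Hypothetical toy sample: three compounds with 4-bit fingerprints.
area_percent = [60.0, 30.0, 10.0]
fingerprints = [[1, 1, 0, 0], [1, 0, 1, 0], [0, 0, 1, 1]]
nodes, edges = build_oil_graph(area_percent, fingerprints)
```

With n compounds the complete graph has n(n-1)/2 undirected edges, so the toy sample yields three. A graph layer such as GCNconv or GATconv would then consume the node features and the edge list, usually with both directions of each edge listed explicitly.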

Abstract

Although experimental designs and methodological surveys for mono-molecular activity/property prediction have been extensively investigated, those for chemical composition have received little attention, with the exception of a few prior studies. In this study, we configured three simple DNN regressors to predict essential oil properties based on chemical composition. Although the models overfit because of the small size of the dataset, all of them were trained effectively.

Paper Structure

This paper contains 12 sections, 4 figures, and 2 tables.

Figures (4)

  • Figure 1: Data in website for the property table and analytical table
  • Figure 2: Data points of each 'Plant Tissue Name' in the property table
  • Figure 3: AUC history by three regressors with two loss designs; (i) CNN, (ii) GCNconv, and (iii) GATconv
  • Figure 4: ROC plot in the prediction by GATconv with NLL_loss (5th entry in Table 2)