FlavorDiffusion: Predicting Food Pairings and Chemical Interactions Using Diffusion Models

Seo Jun Pyo

TL;DR

FlavorDiffusion predicts food pairings and ingredient-chemical interactions without chromatography by applying a graph-based diffusion model to a heterogeneous food-chemical network. It combines subgraph sampling, a forward Gaussian diffusion over edge scores, and a node-conditioned reverse denoising process implemented with an anisotropic GNN, augmented by a Chemical Structure Prediction (CSP) layer. The approach improves NMI-based clustering and generalizes robustly across subgraph sizes; the CSP layer provides the strongest gains and enables meaningful discovery of novel ingredient combinations. The framework offers a scalable, interpretable way to align culinary and chemical properties, supporting chemistry-aware flavor design and computational gastronomy research.
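
To make the "forward Gaussian diffusion over edge scores" step concrete, the following is a minimal sketch, not the paper's implementation. It assumes the standard DDPM closed form x_t = sqrt(ᾱ_t)·x_0 + sqrt(1 − ᾱ_t)·ε with a linear noise schedule. The function names (`linear_beta_schedule`, `cumulative_alphas`, `forward_diffuse`), the schedule endpoints, and the toy 3-node edge-score matrix are all hypothetical choices made for illustration.

```python
import math
import random


def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced noise variances (an assumed schedule, not from the paper)."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """Running product of (1 - beta_s), i.e. alpha-bar_t for each step t."""
    alpha_bars = []
    running = 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def forward_diffuse(edge_scores, t, alpha_bars, rng):
    """Sample a noisy edge-score matrix x_t given clean scores x_0.

    Noise is drawn once per undirected edge so the result stays symmetric;
    the diagonal (self-edges) is left at zero.
    """
    n = len(edge_scores)
    signal = math.sqrt(alpha_bars[t])
    noise = math.sqrt(1.0 - alpha_bars[t])
    noisy = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            value = signal * edge_scores[i][j] + noise * rng.gauss(0.0, 1.0)
            noisy[i][j] = value
            noisy[j][i] = value
    return noisy


# Toy symmetric edge-score matrix for a 3-node subgraph (made-up values).
x0 = [
    [0.0, 0.8, 0.1],
    [0.8, 0.0, 0.5],
    [0.1, 0.5, 0.0],
]
alpha_bars = cumulative_alphas(linear_beta_schedule(1000))
x_early = forward_diffuse(x0, 0, alpha_bars, random.Random(0))    # nearly clean
x_late = forward_diffuse(x0, 999, alpha_bars, random.Random(0))   # nearly pure noise
```

In this sketch, early steps keep the edge scores close to their original values, while late steps are dominated by Gaussian noise. The reverse process described above would then be trained to undo that corruption, conditioned on node features.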

Abstract

The study of food pairing has evolved beyond subjective expertise with the advent of machine learning. This paper presents FlavorDiffusion, a novel framework leveraging diffusion models to predict food-chemical interactions and ingredient pairings without relying on chromatography. By integrating graph-based embeddings, diffusion processes, and chemical property encoding, FlavorDiffusion addresses data imbalances and enhances clustering quality. Using a heterogeneous graph derived from datasets like Recipe1M and FlavorDB, our model demonstrates superior performance in reconstructing ingredient-ingredient relationships. The addition of a Chemical Structure Prediction (CSP) layer further refines the embedding space, achieving state-of-the-art NMI scores and enabling meaningful discovery of novel ingredient combinations. The proposed framework represents a significant step forward in computational gastronomy, offering scalable, interpretable, and chemically informed solutions for food science.
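
Since clustering quality is reported as NMI (normalized mutual information), a small worked example may help in reading those scores. The sketch below computes NMI between two label assignments using arithmetic-mean normalization, which is one common convention; the source does not state which normalization the paper uses. The label lists are made-up stand-ins for ingredient categories and predicted clusters.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (natural log) of a label assignment."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def mutual_information(labels_a, labels_b):
    """Mutual information between two assignments over the same items."""
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    mi = 0.0
    for (a, b), c in joint.items():
        mi += (c / n) * math.log(c * n / (count_a[a] * count_b[b]))
    return mi


def nmi(labels_a, labels_b):
    """NMI with arithmetic-mean normalization (an assumed convention)."""
    h_a, h_b = entropy(labels_a), entropy(labels_b)
    if h_a == 0 and h_b == 0:
        return 1.0  # both assignments are a single cluster
    return mutual_information(labels_a, labels_b) / ((h_a + h_b) / 2)


# Hypothetical ground-truth categories and two candidate clusterings.
true_categories = ["fruit", "fruit", "dairy", "dairy"]
relabelled = [1, 1, 0, 0]   # same grouping, different cluster ids
unrelated = [0, 1, 0, 1]    # grouping independent of the categories

score_same = nmi(true_categories, relabelled)
score_unrelated = nmi(true_categories, unrelated)
```

NMI is invariant to how clusters are numbered, so `score_same` is 1 up to floating-point rounding, while `score_unrelated` is 0 because the second clustering carries no information about the categories.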

Paper Structure

This paper contains 31 sections, 20 equations, 2 figures, and 4 tables.

Figures (2)

  • Figure 1: Embedding space comparison under different configurations. (Left) Baseline embeddings show poor separation between ingredients and compounds. (Center) FlavorDiffusion (200 nodes) without CSP achieves improved clustering of chemical compounds and hub ingredients. (Right) FlavorDiffusion (200 nodes) with CSP yields well-defined clusters, using chemical fingerprints to enhance separation.
  • Figure 2: Progression of edge scores over diffusion steps for a 25-node subgraph. The color intensity represents edge scores normalized between 0 and 1. The reconstructed graph increasingly aligns with the ground truth structure.