A topological analysis of the space of recipes

Emerson G. Escolar, Yuta Shimada, Masahiro Yuasa

TL;DR

This exploratory work introduces the use of topological data analysis, especially persistent homology, in order to study the space of culinary recipes, and proposes a method to generate novel ingredient combinations using combinatorial optimization on this topological information.

Abstract

In recent years, the use of data-driven methods has provided insights into underlying patterns and principles behind culinary recipes. In this exploratory work, we introduce the use of topological data analysis, especially persistent homology, in order to study the space of culinary recipes. In particular, persistent homology analysis provides a set of recipes surrounding the multiscale "holes" in the space of existing recipes. We then propose a method to generate novel ingredient combinations using combinatorial optimization on this topological information. We made biscuits using the novel ingredient combinations, which were confirmed to be acceptable enough by a sensory evaluation study. Our findings indicate that topological data analysis has the potential for providing new tools and insights in the study of culinary recipes.

Paper Structure (27 sections, 2 theorems, 15 equations, 10 figures, 4 tables)

Key Result

Theorem 2.2

Let $(X,d_X)$ and $(Y,d_Y)$ be finite dissimilarity spaces. Then, where $d_{\text{GH}}$ is the Gromov-Hausdorff distance (see burago2001course).
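For orientation, the Vietoris-Rips stability bound from chazal2014persistence is commonly written in the form below. The symbols $d_B$ (bottleneck distance) and $D_q(V(\cdot))$ (degree-$q$ persistence diagram of the Vietoris-Rips filtration) are assumed notation and may differ from the paper's. The constant $2$ corresponds to the convention in which a simplex enters $V_t$ once all of its pairwise dissimilarities are at most $t$; other normalizations of the filtration parameter change the constant.

```latex
% Assumed notation: d_B = bottleneck distance,
% D_q(V(X)) = degree-q persistence diagram of the Vietoris-Rips filtration of X.
\[
  d_B\bigl(D_q(V(X)),\, D_q(V(Y))\bigr) \;\le\; 2\, d_{\text{GH}}(X, Y)
\]
```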

Figures (10)

  • Figure 1: Histogram of cosine dissimilarities of distinct pairs of distinct recipes from the recipe data ahn2011flavor, restricted to dissimilarities less than $1$. There were 482,978,610 (out of 1,199,642,653) pairs with dissimilarity exactly equal to $1$. These are the pairs of recipes that share no ingredients at all.
  • Figure 2: Illustrations for the Vietoris-Rips complex in Example \ref{example:vrcomplex}. \ref{subfig:exampleVRdists}, cosine dissimilarities between pairs of points. \ref{subfig:exampleVRt1}, Vietoris-Rips complex $V_{t_1}(X)$ with threshold $t_1 = 1-\frac{1}{\sqrt{2}}$. \ref{subfig:exampleVRt2}, Vietoris-Rips complex $V_{t_2}(X)$ with threshold $t_2 = 0.5$.
  • Figure 3: Result of the persistent homology analysis of the recipe data from ahn2011flavor. \ref{subfig:pd1}, $1$st degree persistence diagram. The birth-death pairs with the top nine longest lifespans (lifespan $>0.27$) are starred. The $5\%$ of all birth-death pairs with the longest lifespans ($5{,}272$ birth-death pairs, "the top $5\%$ birth-death pairs") are shown in red. \ref{subfig:pd1histo}, histogram (with frequency in log-scale) of lifespans for the persistence diagram. The lifespan $0.27$ is marked with the dashed line. The dotted red line is at the smallest lifespan ($\approx 0.104$) of the top $5\%$ birth-death pairs.
  • Figure 4: Basic information about the representative cycle of the birth-death pair with the longest lifespan. \ref{subfig:partiallistofrecipes}, partial list of the existing recipes (showing only $20$ out of a total of $97$ recipes) in a representative cycle for the birth-death pair with the longest lifespan. \ref{subfig:regionalitytop}, regions associated to the $97$ recipes appearing in the representative cycle of the birth-death pair with the longest lifespan.
  • Figure 5: Regionality of the representative cycles of the top $5\%$ birth-death pairs ($5{,}272$ cycles). \ref{subfig:regionconcentration}, histogram of number of distinct regions involved in the recipes of representative cycles of the top $5\%$ birth-death pairs. \ref{subfig:regionalityaggregate}, relative frequencies of the regions associated to the recipes in the representative cycles of the top $5\%$ birth-death pairs, versus in those in the original data.
  • ...and 5 more figures
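
The cosine dissimilarity and the Vietoris-Rips complex mentioned in the captions of Figures 1 and 2 can be illustrated with a minimal Python sketch. It assumes that each recipe is a non-empty set of ingredients encoded as a binary indicator vector, which is consistent with recipes sharing no ingredients having dissimilarity exactly $1$, but the encoding is an assumption here. The toy recipes and the function names `cosine_dissimilarity` and `vietoris_rips` are hypothetical and are not the paper's example or code.

```python
from itertools import combinations
from math import sqrt


def cosine_dissimilarity(a, b):
    """1 - cosine similarity of the binary indicator vectors of two
    non-empty ingredient sets (assumed encoding)."""
    return 1.0 - len(a & b) / sqrt(len(a) * len(b))


def vietoris_rips(points, t, max_dim=2):
    """Return all simplices (as index tuples) up to dimension max_dim
    whose vertices are pairwise within dissimilarity t."""
    n = len(points)
    close = {
        (i, j)
        for i, j in combinations(range(n), 2)
        if cosine_dissimilarity(points[i], points[j]) <= t
    }
    simplices = [(i,) for i in range(n)]  # every point is a 0-simplex
    for k in range(2, max_dim + 2):  # k vertices span a (k-1)-simplex
        for s in combinations(range(n), k):
            if all(pair in close for pair in combinations(s, 2)):
                simplices.append(s)
    return simplices


# Hypothetical toy recipes, not taken from the paper's data.
recipes = [
    frozenset({"flour", "butter"}),
    frozenset({"flour", "sugar"}),
    frozenset({"butter", "sugar"}),
    frozenset({"soy sauce", "ginger"}),
]

t1 = 1 - 1 / sqrt(2)  # threshold values quoted in the Figure 2 caption
t2 = 0.5

complex_t1 = vietoris_rips(recipes, t1)
complex_t2 = vietoris_rips(recipes, t2)
```

In this toy data the three baking recipes are pairwise at dissimilarity $0.5$, so they are isolated points at $t_1$ and form a filled triangle at $t_2$, while the fourth recipe shares no ingredient with the others and stays isolated at both thresholds. Enumerating all vertex subsets is only feasible for tiny inputs; data on the scale described in Figure 1 would call for a dedicated persistent homology library.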

Theorems & Definitions (3)

  • Example 2.1
  • Theorem 2.2: chazal2014persistence, also stated in turner2019rips
  • Theorem A.1