FEAST: Retrieval-Augmented Multi-Hierarchical Food Classification for the FoodEx2 System

Lorenzo Molfetta, Alessio Cocchieri, Stefano Fantazzini, Giacomo Frisoni, Luca Ragazzi, Gianluca Moro

TL;DR

FEAST (Food Embedding And Semantic Taxonomy) is a retrieval-augmented framework that decomposes FoodEx2 classification into three stages. It uses the system's hierarchical structure to guide training and applies deep metric learning to learn discriminative embeddings that mitigate data sparsity and improve generalization on rare and fine-grained labels.

Abstract

Hierarchical text classification (HTC) and extreme multi-label classification (XML) tasks face compounded challenges from complex label interdependencies, data sparsity, and extreme output dimensions. These challenges are exemplified in the European Food Safety Authority's FoodEx2 system, a standardized food classification framework essential for food consumption monitoring and contaminant exposure assessment across Europe. FoodEx2 coding transforms natural language food descriptions into a set of codes from multiple standardized hierarchies, but faces implementation barriers due to its complex structure. Given a food description (e.g., "organic yogurt"), the system identifies its base term ("yogurt"), all applicable facet categories (e.g., "production method"), and then every relevant facet descriptor for each category (e.g., "organic production"). While existing models perform adequately on well-balanced and semantically dense hierarchies, no prior work has addressed the practical constraints imposed by the FoodEx2 system. The limited literature addressing such real-world scenarios further compounds these challenges. We propose FEAST (Food Embedding And Semantic Taxonomy), a novel retrieval-augmented framework that decomposes FoodEx2 classification into a three-stage approach: (1) base term identification, (2) multi-label facet prediction, and (3) facet descriptor assignment. By leveraging the system's hierarchical structure to guide training and performing deep metric learning, FEAST learns discriminative embeddings that mitigate data sparsity and improve generalization on rare and fine-grained labels. Evaluated on the multilingual FoodEx2 benchmark, FEAST improves on the F1 scores of the prior European CNN baseline by 12-38% on rare classes.
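To make the target output concrete, the following is a minimal sketch of how one coded food description could be represented, using only the "organic yogurt" example from the abstract. The class name `FoodEx2Coding` and its fields are hypothetical, and human-readable labels stand in for the actual FoodEx2 codes, which are not given here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class FoodEx2Coding:
    """Hypothetical container for the three outputs named in the abstract."""

    description: str  # free-text input
    base_term: str    # stage 1: a single base term
    # stages 2 and 3: facet category -> facet descriptors chosen for it
    facets: Dict[str, List[str]] = field(default_factory=dict)

    def facet_categories(self) -> List[str]:
        """Multi-label output of stage 2 (sorted for determinism)."""
        return sorted(self.facets)

    def flat_labels(self) -> List[Tuple[str, str]]:
        """All (category, descriptor) pairs produced by stage 3."""
        return [(cat, d) for cat in sorted(self.facets) for d in self.facets[cat]]


# The example from the abstract, with labels used in place of real codes.
example = FoodEx2Coding(
    description="organic yogurt",
    base_term="yogurt",
    facets={"production method": ["organic production"]},
)
```

Keeping descriptors grouped under their facet category mirrors the dependency between stages: a descriptor is only assigned once its category has been selected.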

Paper Structure (52 sections, 13 equations, 4 figures, 6 tables)

Figures (4)

  • Figure 1: Illustration of the FoodEx2 coding system. An input food item is represented by a base term and facet descriptors detailing its key attributes, such as ingredients, processing, and packaging.
  • Figure 2: Overview of the proposed FoodEx2 food classification pipeline, which operates in three main stages: (1) Base Term Classification, where top-$K$ candidate base terms are retrieved and reranked to produce a top-1 base term; (2) Facet Category Classification, which identifies relevant facet categories using either an LLM, a multi-label classifier, or a bi-encoder approach; and (3) Facet Descriptor Classification, which retrieves and reranks descriptor candidates for the previously selected facet categories. The system combines encoder-based retrieval with flexible reranking strategies (LLM instruct or cross-encoder) to map free-text food descriptions into structured FoodEx2 codes. $\oplus$ denotes concatenation and $f_p(\cdot)$ represents a prompt template function.
  • Figure 3: Prompt templates used for multitask instruction tuning of the LLM across the base term, facet category, and facet descriptor classification tasks.
  • Figure 4: Comparison of F1 scores between Llama 3.1 (meta2024llama31) and DeBERTa (DBLP:conf/iclr/HeLGC21) across different facet categories (Task II).
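
The three-stage flow described in the Figure 2 caption (retrieve top-$K$ candidates, rerank to a top-1 choice, then repeat per selected facet category) can be sketched as below. This is an illustration under loud assumptions, not the authors' implementation: `embed` is a toy bag-of-words stand-in for the trained encoder, `rerank` simply keeps the best retrieval score where the paper uses an LLM or cross-encoder, the label inventories are invented for the example, each category receives a single descriptor, and the concatenation ($\oplus$) and prompt template $f_p(\cdot)$ steps are omitted.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

# Invented toy label inventories (assumption; not real FoodEx2 content).
BASE_TERMS = ["yogurt", "cow milk", "cheese"]
CATEGORIES = {
    "production method": "production method organic farmed wild",
    "packaging": "packaging material canned bottled",
}
DESCRIPTORS = {
    "production method": ["organic production", "conventional production"],
    "packaging": ["canned", "bottled"],
}


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; stands in for a learned encoder."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, candidates: List[str], k: int) -> List[Tuple[str, float]]:
    """Return the k candidates most similar to the query."""
    q = embed(query)
    scored = [(c, cosine(q, embed(c))) for c in candidates]
    return sorted(scored, key=lambda pair: -pair[1])[:k]


def rerank(query: str, shortlist: List[Tuple[str, float]]) -> str:
    """Placeholder reranker: keeps the highest retrieval score."""
    return max(shortlist, key=lambda pair: pair[1])[0]


def classify(description: str, k: int = 2, threshold: float = 0.2) -> dict:
    # Stage 1: base term = rerank(top-k retrieval).
    base = rerank(description, retrieve(description, BASE_TERMS, k))
    # Stage 2: multi-label facet categories via a similarity threshold.
    q = embed(description)
    cats = [c for c, text in CATEGORIES.items() if cosine(q, embed(text)) >= threshold]
    # Stage 3: retrieve and rerank descriptors only within selected categories.
    facets: Dict[str, str] = {
        c: rerank(description, retrieve(description, DESCRIPTORS[c], k)) for c in cats
    }
    return {"base_term": base, "facets": facets}


result = classify("organic yogurt")
```

With these toy inventories, `classify("organic yogurt")` selects "yogurt" as the base term and "organic production" under "production method". Restricting stage 3 to the categories chosen in stage 2 is what keeps the descriptor search space small.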