SemEval-2025 Task 9: The Food Hazard Detection Challenge
Korbinian Randl, John Pavlopoulos, Aron Henriksson, Tony Lindgren, Juli Bakagianni
TL;DR
This paper presents SemEval-2025 Task 9, the Food Hazard Detection Challenge, which targets explainable text classification of food-incident reports in two subtasks: coarse-grained prediction of the hazard and product categories (ST1) and fine-grained prediction of the exact hazard and product "vectors" (ST2). Submissions are ranked with a macro $F_1$-based score that puts more weight on hazard prediction than on product prediction. The paper compares encoder-only, encoder-decoder, and decoder-only transformer systems and finds that large language model-generated synthetic data can effectively counter long-tail class distributions. An analysis of the participant systems (≈260 entrants, 99 submissions, 27 system descriptions) indicates that richer input features, ensemble methods, and synthetic data contribute most to performance, while no single transformer architecture consistently dominates. The findings underscore the potential of synthetic data and model ensembles for real-world food-hazard information extraction and point to avenues for future work, including the difficulty of the vector subtask, explainability, and robust debugging on imbalanced, domain-specific datasets.
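To make the idea of a hazard-weighted, macro $F_1$-based score concrete, here is a minimal sketch in plain Python. The summary above does not give the formula, so the combination rule is an assumption: the score averages the macro $F_1$ over hazard labels with the macro $F_1$ over product labels, where the product term is computed only on reports whose hazard was predicted correctly. The function names `macro_f1` and `hazard_weighted_score` and the example labels are hypothetical.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-label F1 over all labels seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0  # nothing to score
    pairs = list(zip(y_true, y_pred))
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def hazard_weighted_score(hazards_true, products_true, hazards_pred, products_pred):
    """Assumed scheme: mean of hazard macro-F1 and product macro-F1,
    with the product term restricted to reports whose hazard is correct."""
    f1_hazards = macro_f1(hazards_true, hazards_pred)
    # Keep only the reports where the hazard prediction matches the gold label.
    keep = [i for i, (t, p) in enumerate(zip(hazards_true, hazards_pred)) if t == p]
    f1_products = macro_f1(
        [products_true[i] for i in keep],
        [products_pred[i] for i in keep],
    )
    return (f1_hazards + f1_products) / 2


# Hypothetical toy labels, for illustration only.
hazards_true = ["biological", "allergens", "biological", "chemical"]
hazards_pred = ["biological", "allergens", "chemical", "chemical"]
products_true = ["meat", "dairy", "meat", "fruit"]
products_pred = ["meat", "meat", "dairy", "fruit"]

score = hazard_weighted_score(hazards_true, products_true, hazards_pred, products_pred)
print(round(score, 4))
```

Under this assumed rule, a wrong hazard prediction removes the report from the product term entirely, so a system cannot make up for poor hazard predictions with good product predictions. Macro averaging also gives rare classes in the long tail the same weight as frequent ones.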
Abstract
In this challenge, we explored text-based food hazard prediction with long-tail distributed classes. The task was divided into two subtasks: (1) predicting whether a web text implies one of ten food-hazard categories and identifying the associated food category, and (2) providing a more fine-grained classification by assigning a specific label to both the hazard and the product. Our findings highlight that large language model-generated synthetic data can be highly effective for oversampling long-tail distributions. Furthermore, we find that fine-tuned encoder-only, encoder-decoder, and decoder-only systems achieve comparable maximum performance across both subtasks. During this challenge, we gradually released (under CC BY-NC-SA 4.0) a novel set of 6,644 manually labeled food-incident reports.
