Semmeldetector: Application of Machine Learning in Commercial Bakeries

Thomas H. Schmitt; Maximilian Bundscherer; Tobias Bocklet

TL;DR

This work tackles the challenge of detecting and counting diverse baked goods in commercial bakeries with a limited dataset. It combines Copy-Paste data synthesis, SAM-based annotation, and grayscale imagery to train a YOLOv8 detector, achieving a strong $AP_{50}$ on a held-out test set (up to $89.1\%$) using $1280$ px input resolution. Ablation studies demonstrate the value of synthetic data and grayscale processing for robustness, while highlighting remaining confusions among morphologically similar items. The approach is demonstrated as an end-to-end pipeline, including an iOS deployment, with practical implications for reducing unsold product waste and improving bakery production planning. The results suggest that carefully engineered data augmentation and model scaling can overcome data scarcity in specialized industrial domains, enabling real-world optimization in the baking industry.

Abstract

The Semmeldetector is a machine learning application that uses object detection models to detect, classify, and count baked goods in images. It lets commercial bakers track unsold baked goods so that they can optimize production and use resources more efficiently. To train our detection models, we compiled a dataset of 1151 images that distinguishes between 18 types of baked goods. To facilitate model training, we used a Copy-Paste augmentation pipeline to expand our dataset. We trained the state-of-the-art object detection model YOLOv8 on our detection task and tested the impact of different training data, model scales, and online image augmentation pipelines on model performance. Our best-performing model achieved an AP@0.5 of 89.1% on our test set. Based on our results, we conclude that machine learning can be a valuable tool even in industries where its use is unexpected, such as bakeries, and even with very limited datasets.
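To make the Copy-Paste idea concrete, the following is a minimal sketch of pasting one segmented object onto a background image and deriving its bounding-box label. It is an illustration under stated assumptions, not the authors' pipeline: the function name `paste_object`, the array layout, and the box format are hypothetical, and scaling, rotation, and overlap handling are left out.

```python
import numpy as np

def paste_object(background, patch, mask, top, left):
    """Paste the masked pixels of `patch` onto a copy of `background`.

    background: (H, W, 3) image array
    patch:      (h, w, 3) crop containing one object
    mask:       (h, w) boolean array, True where the object is
    top, left:  paste position; the patch is assumed to fit inside the background

    Returns the composite image and the pasted object's bounding box
    as (x_min, y_min, x_max, y_max) in pixel coordinates (hypothetical format).
    """
    out = background.copy()
    h, w = mask.shape
    region = out[top:top + h, left:left + w]  # view into `out`
    region[mask] = patch[mask]                # overwrite only object pixels
    ys, xs = np.nonzero(mask)
    box = (left + int(xs.min()), top + int(ys.min()),
           left + int(xs.max()) + 1, top + int(ys.max()) + 1)
    return out, box

# Toy example: a 2x2 white "object" inside a 4x4 crop, pasted on a black image.
background = np.zeros((10, 10, 3), dtype=np.uint8)
patch = np.full((4, 4, 3), 255, dtype=np.uint8)
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
composite, box = paste_object(background, patch, mask, top=2, left=3)
```

Because the box is computed from the mask rather than from the crop, the label stays tight around the object even when the crop includes surrounding background.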

Paper Structure (26 sections, 4 figures, 5 tables)

This paper contains 26 sections, 4 figures, 5 tables.

Introduction
Related Work
Data
Training Set
Validation Set
Test Set
Image Annotation
Methods
Copy-Paste Augmentation
Online Image Augmentation
Tested Object Detection Models
YOLO
Deformable DETR
Experiments and Results
Average Precision (AP)
...and 11 more sections

Figures (4)

Figure 1: Relative baked good distributions in our baseline training, validation and test set.
Figure 2: Training set images: (top left) image of a baked good, (top right) synthetic image featuring Sonnenblumensemmeln (sunflower bread buns) and Vollgutsemmeln (wholemeal bread buns), (middle left) synthetic image of baked goods, (middle right) synthetic image of baked goods on a beach, (bottom row) scaled and rotated synthetic baked good images.
Figure 3: Examples of images after applying our online augmentation $DO_{0.04}$.
Figure 4: Confusion Matrix of our best model's test set predictions, at minimum confidence and $IoU$ thresholds of $0.25$ and $0.45$, respectively.
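The thresholds in the Figure 4 caption can be illustrated with a short sketch of Intersection over Union (IoU) between two axis-aligned boxes. The helper names `iou` and `is_match`, the `(x_min, y_min, x_max, y_max)` box format, and the rule that a prediction counts only if it meets both thresholds are assumptions made for illustration; the exact matching procedure behind the confusion matrix is not given here.

```python
def iou(a, b):
    """Intersection over Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # Width and height of the overlap rectangle, clamped at zero for disjoint boxes.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def is_match(pred_box, pred_conf, gt_box, conf_thres=0.25, iou_thres=0.45):
    """Assumed rule: keep a prediction only if it is confident enough
    and overlaps the ground-truth box sufficiently."""
    return pred_conf >= conf_thres and iou(pred_box, gt_box) >= iou_thres

overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))  # intersection 1, union 7
```

Under this assumed rule, a prediction with confidence 0.9 but an IoU of about 0.14, as in the toy boxes above, would not be matched to the ground-truth object.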
