ICPR 2024 Competition on Domain Adaptation and GEneralization for Character Classification (DAGECC)

Sofia Marino, Jennifer Vandoni, Emanuel Aldea, Ichraq Lemghari, Sylvie Le Hégarat-Mascle, Frédéric Jurie

TL;DR

The paper addresses robust character recognition under domain shift by introducing the DAGECC competition and the Safran-MNIST dataset suite, designed for Domain Generalization (DG) and Unsupervised Domain Adaptation (UDA). The competition comprises two tasks: DG, where no target data are available, and UDA, where unlabeled target data are available. Both are evaluated with the Macro-averaged F1-score to account for class imbalance. Top entries repeatedly rely on pretrained CNN backbones (e.g., ResNet50, GoogLeNet) and on synthetic data generation to approximate the target-domain distribution, reaching Macro F1 scores of up to $0.822$ on Task 1 and $0.652$ on Task 2. The dataset and competition framework aim to enable fast prototyping, reproducibility, and practical impact in industrial serial-number recognition and related domains, with public release and community involvement encouraged through Codabench. The lightweight Safran-MNIST data and the emphasis on domain adaptation and generalization make it a pragmatic benchmark for research and development in real-world OCR tasks.
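To illustrate why a Macro-averaged F1-score is a reasonable choice under class imbalance, the following is a minimal sketch of the metric in plain Python. It is not the competition's evaluation script: the function name `macro_f1`, the toy labels, and the convention of scoring a class with no true or predicted samples as 0 are all assumptions made for illustration.

```python
from collections import Counter


def macro_f1(y_true, y_pred, labels=None):
    """Unweighted mean of per-class F1 scores (illustrative sketch)."""
    if labels is None:
        labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    scores = []
    for c in labels:
        denom = 2 * tp[c] + fp[c] + fn[c]
        # Assumed convention: a class with no support and no predictions scores 0.
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical imbalanced toy case: 8 samples of class 0, 2 of class 1,
# and a classifier that always predicts the majority class.
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 10
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(f"accuracy = {accuracy:.3f}, macro F1 = {macro_f1(y_true, y_pred):.3f}")
```

In this toy case the majority-class predictor is right on 8 of 10 samples, yet its macro F1 is much lower because the minority class contributes an F1 of 0 with the same weight as the majority class.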

Abstract

In this companion paper for the DAGECC (Domain Adaptation and GEneralization for Character Classification) competition, organized as part of the ICPR 2024 conference, we present the general context of the tasks we proposed to the community, introduce the data that were prepared for the competition, and summarize the results along with a description of the top three winning entries. The competition was centered on domain adaptation and generalization, and our core aim is to foster interest and facilitate progress on these topics by providing a high-quality, lightweight, real-world dataset that supports fast prototyping and validation of novel ideas.

Paper Structure

This paper contains 13 sections, 2 equations, 4 figures, 3 tables.

Figures (4)

  • Figure 1: Distribution of class samples in Safran-MNIST-D by sets.
  • Figure 2: Distribution of class samples in Safran-MNIST-DLS by sets.
  • Figure 3: Images extracted from Safran-MNIST dataset suite.
  • Figure 4: Breakdown of competition participation.