HydraMix: Multi-Image Feature Mixing for Small Data Image Classification

Christoph Reinders, Frederik Schubert, Bodo Rosenhahn

TL;DR

HydraMix addresses the scarcity of labeled data in image classification by learning to compose new images from multiple images of the same class. An encoder–decoder architecture mixes the images in feature space under the guidance of a segmentation-based mask and is trained with reconstruction, perceptual, and adversarial losses, so it generates diverse samples without pretraining. On ciFAIR-10, STL-10, and ciFAIR-100 with only a few samples per class, it outperforms existing state-of-the-art methods, including competing mixing methods. The paper also introduces CLIP Synset Entropy to quantify the diversity of augmented datasets. The approach targets applications where data is scarce or restricted by privacy concerns, and it could serve as a basis for cross-domain generalization and for integration with automated augmentation.

Abstract

Training deep neural networks requires datasets with a large number of annotated examples. The collection and annotation of these datasets is not only extremely expensive but also faces legal and privacy problems. These factors are a significant limitation for many real-world applications. To address this, we introduce HydraMix, a novel architecture that generates new image compositions by mixing multiple different images from the same class. HydraMix learns the fusion of the content of various images guided by a segmentation-based mixing mask in feature space and is optimized via a combination of unsupervised and adversarial training. Our data augmentation scheme allows the creation of models trained from scratch on very small datasets. We conduct extensive experiments on ciFAIR-10, STL-10, and ciFAIR-100. Additionally, we introduce a novel text-image metric to assess the generality of the augmented datasets. Our results show that HydraMix outperforms existing state-of-the-art methods for image classification on small datasets.
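
The mixing step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the encoder, decoder, and losses are omitted, the segment map is assumed to be precomputed, and assigning each segment to a randomly chosen source image is an assumption made for illustration. The names `sample_mixing_mask` and `mix_features` are hypothetical.

```python
import random


def sample_mixing_mask(segments, num_images, rng):
    """Assign each segment id to one randomly chosen source image.

    segments: H x W nested list of integer segment ids (assumed precomputed).
    Returns an H x W nested list of source-image indices in [0, num_images).
    """
    segment_ids = sorted({sid for row in segments for sid in row})
    owner = {sid: rng.randrange(num_images) for sid in segment_ids}
    return [[owner[sid] for sid in row] for row in segments]


def mix_features(feature_maps, mask):
    """Copy each spatial location from the feature map the mask selects.

    feature_maps: list of N maps, each a C x H x W nested list.
    mask: H x W nested list of indices into feature_maps.
    """
    channels = len(feature_maps[0])
    return [
        [
            [feature_maps[src][c][y][x] for x, src in enumerate(row)]
            for y, row in enumerate(mask)
        ]
        for c in range(channels)
    ]


# Toy example: three 1-channel 2x4 "feature maps" filled with 0.0, 1.0, 2.0,
# standing in for encoder outputs of three images from the same class.
rng = random.Random(0)
feature_maps = [[[[float(i)] * 4 for _ in range(2)]] for i in range(3)]
segments = [[0, 0, 1, 1],
            [0, 2, 2, 1]]
mask = sample_mixing_mask(segments, num_images=len(feature_maps), rng=rng)
mixed = mix_features(feature_maps, mask)
```

Because the mask is constant within a segment, every region of the mixed feature map comes from a single source image. Sampling a different set of images or a different mask yields a different composition, which is where the variety of generated images would come from in this simplified picture.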

Paper Structure (28 sections, 9 equations, 9 figures, 10 tables)

Figures (9)

  • Figure 1: HydraMix introduces a novel feature-mixing architecture that combines the content of an arbitrary number of images in a feature space guided by a segmentation-based mixing mask. The model is optimized with reconstruction and adversarial losses. Afterward, HydraMix enables the generation of a large variety of new image compositions by sampling images and mixing masks.
  • Figure 2: Qualitative samples using the HydraMix method on three classes from the STL-10 dataset. For each class, four original images and their segmentation masks (left) and ten generated image compositions (right) are shown.
  • Figure 3: CLIP Synset Entropy ($\uparrow$) of the original dataset, a dataset generated with MixUp, and a dataset generated with HydraMix. By sampling new compositions, HydraMix is able to generate a larger variety of images that cover more synset concepts.
  • Figure 4: Average classification accuracy on the evaluated datasets with $5$ samples per class for ChimeraMix+Grid and HydraMix, and for their respective ablation variants that do not use a generator.
  • Figure 5: Evaluation of the mixing probability $p_\text{gen}$ on different datasets. The default mixing ratio of $0.5$ is indicated by a dashed line. Experiments are performed with 5 examples per class.
  • ...and 4 more figures
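
The exact definition of CLIP Synset Entropy (Figure 3) is not given in this excerpt. One plausible reading, shown below as a sketch under stated assumptions, is the Shannon entropy of how images are distributed over synsets: each image is assumed to have already been assigned to its best-matching synset by a text-image model such as CLIP (that step is not shown), and a dataset that covers more synset concepts more evenly scores higher. The function name and the synset labels are hypothetical.

```python
import math
from collections import Counter


def synset_assignment_entropy(assigned_synsets):
    """Shannon entropy (in nats) of the empirical synset distribution.

    assigned_synsets: one synset label per image; the labels are assumed
    to come from a text-image matching step that is not part of this sketch.
    """
    if not assigned_synsets:
        return 0.0
    counts = Counter(assigned_synsets)
    total = len(assigned_synsets)
    return -sum((n / total) * math.log(n / total) for n in counts.values())


# Hypothetical assignments: every original image maps to the same synset,
# while the augmented images spread over four different synsets.
original = ["tabby", "tabby", "tabby", "tabby"]
augmented = ["tabby", "tiger_cat", "persian_cat", "siamese_cat"]

entropy_original = synset_assignment_entropy(original)
entropy_augmented = synset_assignment_entropy(augmented)
```

Under this reading, a higher value for the augmented set than for the original set corresponds to the claim in Figure 3 that the generated images cover more synset concepts.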