Instance-Level Generation for Representation Learning

Yankun Wu; Zakaria Laskar; Giorgos Kordopatis-Zilos; Noa Garcia; Giorgos Tolias

TL;DR

This work tackles the data bottleneck in instance-level recognition (ILR) by introducing ILGen, a fully synthetic pipeline that uses an LLM to generate object categories and a generative diffusion model (GDM) to create diverse object instances, backgrounds, and viewpoints. Training a foundation vision encoder with a retrieval-oriented objective (recall@k) on CKN synthetic data improves cross-domain ILR performance across seven benchmarks and demonstrates a new paradigm in which domain names are the only required input. The results show that synthetic data can outperform real labeled data in multi-domain retrieval tasks, highlighting the practicality of synthetic ILR for rapid domain adaptation and wide applicability. The approach integrates LLMs, GDMs, and background relighting to produce high-variance, instance-level training sets that improve universal representation learning for ILR.

Abstract

Instance-level recognition (ILR) focuses on identifying individual objects rather than broad categories, offering the highest granularity in image classification. However, this fine-grained nature makes creating large-scale annotated datasets challenging, limiting ILR's real-world applicability across domains. To overcome this, we introduce a novel approach that synthetically generates diverse object instances from multiple domains under varied conditions and backgrounds, forming a large-scale training set. Unlike prior work on automatic data synthesis, our method is the first to address ILR-specific challenges without relying on any real images. Fine-tuning foundation vision models on the generated data significantly improves retrieval performance across seven ILR benchmarks spanning multiple domains. Our approach offers a new, efficient, and effective alternative to extensive data collection and curation, introducing a new ILR paradigm where the only input is the names of the target domains, unlocking a wide range of real-world applications.
