Car-1000: A New Large Scale Fine-Grained Visual Categorization Dataset

Yutao Hu, Sen Li, Jincheng Yan, Wenqi Shao, Xiaoyan Luo

TL;DR

Car-1000 targets fine-grained car model recognition by introducing a large-scale, up-to-date dataset comprising 1000 models from 165 automakers, organized under a three-tier hierarchical label system. The authors build the dataset through popularity-based model selection from the DongCheDi forum, web image collection, and expert annotation, yielding 140,312 images after cleaning and filtering, with license plates masked to protect privacy. They benchmark 16 networks spanning general-purpose CNNs, transformers, and FGVC-specific methods. The dataset remains highly challenging (no model surpasses 90% accuracy); CAL and PMG emerge as strong FGVC baselines, and HSD offers efficient performance. The public release of Car-1000 provides a new benchmark for FGVC research and for automotive applications.

Abstract

Fine-grained visual categorization (FGVC) is a challenging but significant task in computer vision that aims to recognize different sub-categories of birds, cars, airplanes, etc. Among these, recognizing car models has significant application value in autonomous driving, traffic surveillance, and scene understanding, and has received considerable attention in the past few years. However, Stanford-Car, the most widely used fine-grained dataset for car recognition, has only 196 categories and includes only vehicle models produced earlier than 2013. Owing to rapid advances in the automotive industry in recent years, the appearances of car models have become increasingly intricate and sophisticated. Consequently, the Stanford-Car dataset fails to capture this evolving landscape and cannot satisfy the requirements of the automotive industry. To address these challenges, we introduce Car-1000, a large-scale dataset designed specifically for fine-grained visual categorization of diverse car models. Car-1000 covers 1000 distinct car models from 165 different automakers. Additionally, we have reproduced several state-of-the-art FGVC methods on the Car-1000 dataset, establishing a new benchmark for research in this field. We hope that our work will offer a fresh perspective for future FGVC researchers. Our dataset is available at https://github.com/toggle1995/Car-1000.

Paper Structure

This paper contains 8 sections, 2 figures, 2 tables.

Figures (2)

  • Figure 1: Temporal coverage of the car models in Stanford Car [krause20133d], He et al. [he2015recognition], Hsieh et al. [hsieh2014symmetrical], CompCars Sur. [yang2015large], BVVMR [biglari2017part], and our Car-1000. The taller a column, the more models released in that year are included in the corresponding dataset. Car-1000 has wide temporal coverage and includes models released in recent years.
  • Figure 2: Selected samples from our Car-1000 dataset, with the primary label and the name of the corresponding model shown below each image. In each name, the part before "_" is the automaker and the part after it is the specific model.
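
The naming convention described in the Figure 2 caption (automaker, then "_", then model) can be illustrated with a short sketch. This is not code from the Car-1000 release: the helper name `split_label` and the sample string `"Audi_A4L"` are hypothetical, and splitting at the first underscore is an assumption that would fail if an automaker name itself contained "_".

```python
def split_label(name: str) -> tuple[str, str]:
    """Split an 'Automaker_Model' name into (automaker, model).

    Assumption (not confirmed by the source): the FIRST underscore is the
    separator, so any later underscores stay in the model part.
    """
    automaker, sep, model = name.partition("_")
    if not sep:
        # No underscore at all: the name does not follow the convention.
        raise ValueError(f"no '_' separator in label name: {name!r}")
    return automaker, model


# Hypothetical example string, used only for illustration.
automaker, model = split_label("Audi_A4L")
print(automaker, model)
```

If the released label files use a different separator or ordering, only the `partition` call needs to change.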