Visual Car Brand Classification by Implementing a Synthetic Image Dataset Creation Pipeline
Jan Lippemeier, Stefanie Hittmeyer, Oliver Niehörster, Markus Lange-Hegermann
TL;DR
This work tackles data scarcity in car-brand classification from real traffic footage by proposing an automated pipeline that generates labeled synthetic images with Stable Diffusion and validates them through YOLOv8 bounding-box detection. The method uses German vehicle-registration data to construct a balanced label distribution, generates images in Text-to-Image and Image-to-Image modes, crops to single-car regions, and trains a ResNet-18 classifier in a transfer-learning setup. Key findings show that a synthetic-data-only approach can achieve up to 75% accuracy on real-world tests when using a large, diverse synthetic dataset and combining diffusion modes, albeit with biases toward common brands and notable domain gaps. The approach promises rapid dataset creation without manual labeling, offering practical benefits for data-scarce computer vision tasks while highlighting areas for improvement in bias mitigation and generalization.
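The label-distribution step can be illustrated with a minimal sketch. This summary does not say how the registration counts are turned into per-brand image quotas, so the tempering exponent `alpha`, the function name `images_per_brand`, and the placeholder brand names and counts below are all assumptions made for illustration, not the authors' procedure. With `alpha = 0` every brand receives the same number of images; with `alpha = 1` the quota follows the registration share.

```python
from typing import Dict


def images_per_brand(counts: Dict[str, int], total: int, alpha: float = 0.0) -> Dict[str, int]:
    """Split `total` synthetic images across brands (hypothetical helper).

    alpha = 0.0 -> uniform quota per brand (fully balanced)
    alpha = 1.0 -> quota proportional to the registration counts
    """
    if not counts or total < 0:
        raise ValueError("need at least one brand and a non-negative total")

    # Temper the raw counts, then normalise them into fractional quotas.
    weights = {brand: float(count) ** alpha for brand, count in counts.items()}
    norm = sum(weights.values())
    exact = {brand: total * w / norm for brand, w in weights.items()}

    # Round down, then hand out the leftover images by largest remainder
    # so that the quotas always sum to `total`.
    plan = {brand: int(x) for brand, x in exact.items()}
    leftover = total - sum(plan.values())
    by_remainder = sorted(exact, key=lambda b: exact[b] - plan[b], reverse=True)
    for brand in by_remainder[:leftover]:
        plan[brand] += 1
    return plan


# Placeholder counts, not real registration figures.
example_counts = {"brand_a": 600, "brand_b": 300, "brand_c": 100}
balanced_plan = images_per_brand(example_counts, total=10, alpha=0.0)
proportional_plan = images_per_brand(example_counts, total=10, alpha=1.0)
```

Each quota could then drive the number of prompts issued for that brand in the generation step.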
Abstract
Recent advances in machine learning, especially in deep learning and object detection, have significantly improved performance on tasks such as image classification and image synthesis. Acquiring labeled data that accurately represents a specific use case nevertheless remains a challenge. In this work, we propose an automatic pipeline for generating synthetic image datasets using Stable Diffusion, an image synthesis model capable of producing highly realistic images. We use YOLOv8 for automatic bounding-box detection and quality assessment of the synthesized images. Our contributions are demonstrating the feasibility of training image classifiers solely on synthetic data, automating the image generation pipeline, and describing the computational requirements of our approach. We evaluate the usability of the different modes of Stable Diffusion and achieve a classification accuracy of up to 75% on real-world test images.
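The detection-based quality check can be sketched as a simple acceptance rule. The abstract does not state the acceptance criteria, so the rule used here (keep an image only if exactly one sufficiently confident car box is found), the threshold `min_conf`, and the function name `select_single_car` are assumptions. Detections are passed in as plain tuples, so the sketch does not depend on any particular detector API.

```python
from typing import List, Optional, Tuple

# (x1, y1, x2, y2) in pixel coordinates
Box = Tuple[float, float, float, float]
# (class label, confidence, bounding box)
Detection = Tuple[str, float, Box]


def select_single_car(detections: List[Detection], min_conf: float = 0.5) -> Optional[Box]:
    """Return the crop box if the image contains exactly one confident car.

    Hypothetical rule: images with no confident car, or with several cars,
    are rejected (None), because the brand label would be ambiguous.
    """
    cars = [
        box
        for label, conf, box in detections
        if label == "car" and conf >= min_conf
    ]
    return cars[0] if len(cars) == 1 else None


# One clear car plus an unrelated object: accepted, returns the car's box.
accepted = select_single_car([
    ("car", 0.91, (34.0, 50.0, 410.0, 300.0)),
    ("person", 0.80, (5.0, 5.0, 40.0, 120.0)),
])

# Two confident cars: rejected.
rejected_multi = select_single_car([
    ("car", 0.88, (0.0, 0.0, 100.0, 80.0)),
    ("car", 0.75, (120.0, 10.0, 260.0, 110.0)),
])

# Only a low-confidence car: rejected.
rejected_weak = select_single_car([("car", 0.20, (0.0, 0.0, 50.0, 50.0))])
```

An accepted box would then be used to crop the generated image to the single-car region before it enters the classifier's training set.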
