A Benchmark Grocery Dataset of Realworld Point Clouds From Single View

Shivanand Venkanna Sheshappanavar, Tejas Anvekar, Shivanand Kundargi, Yufan Wang, Chandra Kambhamettu

TL;DR

3DGrocery100 introduces the largest real-world 3D grocery benchmark built from single-view RGB-D captures, yielding 87,898 point clouds across 100 fine-grained classes. The dataset supports evaluation of 3D point cloud classifiers, few-shot generalization via the 63-class subset 3DGrocery63, and class-incremental learning with an LwF-based baseline, in both color and no-color variants. Across six architectures (PointNet, PointNet++, DGCNN, PCT, PointMLP, PointNeXt), the work shows that color is strongly discriminative and highlights the difficulty of cross-domain and continual learning, providing baselines for real-world grocery recognition. The work also offers a pre-training subset (Packages) and supplementary materials, with resources on the project page, targeting applications in automatic checkout, robotic navigation, and assistive technologies.

Abstract

Fine-grained grocery object recognition is an important computer vision problem with broad applications in automatic checkout, in-store robotic navigation, and assistive technologies for the visually impaired. Existing grocery datasets consist mainly of 2D images, so models trained on them are limited to learning features from regular 2D grids. While portable 3D sensors such as Kinect were commonly available for mobile phones, sensors such as LiDAR and TrueDepth have recently been integrated into mobile phones. Despite the availability of mobile 3D sensors, there are currently no dedicated large-scale real-world benchmark 3D datasets for groceries. In addition, existing 3D datasets lack fine-grained grocery categories and have limited training samples. Furthermore, capturing data by moving around each object, rather than taking a single conventional photo, makes data collection cumbersome. Thus, we introduce a large-scale grocery dataset called 3DGrocery100. It comprises 100 classes, with a total of 87,898 3D point clouds created from 10,755 RGB-D single-view images. We benchmark our dataset on six recent state-of-the-art 3D point cloud classification models. We also benchmark the dataset on few-shot and continual learning point cloud classification tasks. Project Page: https://bigdatavision.org/3DGrocery100/.

Paper Structure (30 sections, 1 equation, 22 figures, 13 tables)

This paper contains 30 sections, 1 equation, 22 figures, and 13 tables.

Figures (22)

  • Figure 1: 3DGrocery100 Dataset Statistics. The dataset comprises 10,755 RGB-D images and 87,898 point clouds spread across 100 classes. At a high level, the groceries are categorized into Fruits (10 apple and 24 non-apple classes), Vegetables (28), and Packages (38). Note: Apples are a subset of Fruits. Non-apple fruit RGB-D image count: 2,586; point cloud count: 24,682.
  • Figure 2: (1) iOS app capturing an RGB image and a depth image (darker regions are nearer to the camera) for a class; (2) annotations on the RGB image, with coffee instances labeled [a-e]; (3) [a-e] 1024 farthest-point-sampled 3D points of the 5 coffee objects, with and without color.
  • Figure 3: (a-j) point clouds of apple-evercrisp, apple-fuji, apple-golden-delicious, apple-granny-smith, apple-honeycrisp, apple-pazazz, apple-pink-lady, apple-red-delicious, apple-royal-gala, and apple-wild-twist with color. (k-t) point clouds in the same order without color. Colors play a significant role in distinguishing objects from different classes. Without color, all ten classes of apples appear very similar in the 3D point cloud representation.
  • Figure 4: 3D point cloud representations of the 24 non-apple classes (of the 34 fruit classes) with colors. The sets (bartlett-pear, donjau-pear, red-pear, pear-bosc), (cantaloupe, watermelon, honeydew-melon), (lemon, lime), (nectarine, peach, plum), and (grapefruit, navel-orange) are similar in shape, and these classes are merged in 3DGrocery63. 28 classes of Vegetables with colors. 38 classes of Packages with colors. *-bp is bell-pepper and *-s is spaghetti. The first number is the class ID in 3DGrocery100, and the number in parentheses is the class ID in 3DGrocery63, obtained after merging similar-shaped classes in 3DGrocery100.
  • Figure 5: 34 Fruit Classes: each with the count of images.
  • ...and 17 more figures
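
The Figure 2 caption mentions point clouds reduced to 1024 points by farthest point sampling. As an illustration only, the following is a minimal pure-Python sketch of the standard greedy farthest point sampling procedure. It is not the authors' implementation; the function name `farthest_point_sample`, its `start` parameter, and the toy input are assumptions made for this example.

```python
def farthest_point_sample(points, k, start=0):
    """Return indices of k points picked by greedy farthest point sampling.

    points: sequence of (x, y, z) tuples; k: number of points to keep;
    start: index of the first selected point (hypothetical parameter).
    """
    n = len(points)
    if not 0 < k <= n:
        raise ValueError("k must be between 1 and len(points)")
    # Squared distance from each point to its nearest already-selected point.
    min_sq_dist = [float("inf")] * n
    selected = [start]
    for _ in range(k - 1):
        last = points[selected[-1]]
        best_idx, best_dist = -1, -1.0
        for i, p in enumerate(points):
            # Refresh the nearest-selected distance using the newest selection.
            d = sum((a - b) ** 2 for a, b in zip(p, last))
            if d < min_sq_dist[i]:
                min_sq_dist[i] = d
            # Track the point farthest from the current selected set.
            if min_sq_dist[i] > best_dist:
                best_idx, best_dist = i, min_sq_dist[i]
        selected.append(best_idx)
    return selected


# Toy cloud: four points on the x-axis (illustrative data, not from the dataset).
toy_cloud = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (10, 0, 0)]
print(farthest_point_sample(toy_cloud, 3))  # prints [0, 3, 2]
```

For a real capture, the same call with `k=1024` would select a 1024-point subset that spreads over the object surface. Each iteration scans all points, so the cost is proportional to the number of points times `k`, and a vectorized implementation is preferable for large clouds.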