Learning to Classify New Foods Incrementally Via Compressed Exemplars

Justin Yang, Zhihao Duan, Jiangpeng He, Fengqing Zhu

TL;DR

The paper tackles the problem of adapting food image classifiers to evolving class sets under memory constraints, addressing catastrophic forgetting in class-incremental learning. It proposes a plug-in framework that integrates a continual neural compressor with fixed-decoder training and CAM-guided foreground preservation to store more diverse exemplars in the memory buffer while mitigating domain shift. The approach demonstrates improved classification accuracy on Food-101 and ImageNet-100 and shows meaningful gains on VFN-74 through ablations, highlighting the value of memory-efficient exemplar management in continual learning. This work enables storage-efficient, on-device lifelong learning for dynamic food recognition and offers methods with potential benefits in broader continual learning domains.

Abstract

Food image classification systems play a crucial role in health monitoring and diet tracking through image-based dietary assessment techniques. However, existing food recognition systems rely on static datasets with a pre-defined, fixed number of food classes. This contrasts sharply with the reality of food consumption, where the data changes constantly. Food image classification systems should therefore adapt to and manage continuously evolving data, which is where continual learning plays an important role. A challenge in continual learning is catastrophic forgetting, where ML models tend to discard old knowledge upon learning new information. While memory-replay algorithms have shown promise in mitigating this problem by storing old data as exemplars, they are hampered by the limited capacity of memory buffers, leading to an imbalance between new and previously learned data. To address this, our work explores the use of neural image compression to extend the effective buffer size and enhance data diversity. We introduce the concept of continually learning a neural compression model to adaptively improve the quality of compressed data and optimize the bits per pixel (bpp) so that more exemplars can be stored. Our extensive experiments on the food-specific datasets Food-101 and VFN-74, as well as the general dataset ImageNet-100, demonstrate improvements in classification accuracy. This progress is pivotal in advancing more realistic food recognition systems that are capable of adapting to continually evolving data. Moreover, the principles and methodologies we have developed hold promise for broader applications, extending their benefits to other domains of continual machine learning systems.
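To make the buffer-size argument concrete, the sketch below computes how many exemplars fit in a fixed memory budget at a given bpp. The function name, the 224x224 image size, the 20-image budget, and the 1.5 bpp rate are illustrative assumptions, not values reported in the paper.

```python
def exemplars_that_fit(budget_bytes, height, width, bpp):
    """Count how many height x width images fit in budget_bytes at a given bpp."""
    bits_per_image = height * width * bpp
    return int(budget_bytes * 8 // bits_per_image)


# Hypothetical budget: the space taken by 20 raw 224x224 RGB images (24 bpp).
budget = 20 * 224 * 224 * 3  # bytes

raw = exemplars_that_fit(budget, 224, 224, 24.0)        # uncompressed storage
compressed = exemplars_that_fit(budget, 224, 224, 1.5)  # assumed compressed rate

print(raw, compressed)
```

Under these assumed numbers the same budget holds 16 times as many compressed exemplars as raw ones. The trade-off is that lower bpp means lower reconstruction quality, which is the balance the continually trained compressor is meant to improve.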

Paper Structure (16 sections, 2 equations, 4 figures, 3 tables)

Figures (4)

  • Figure 1: Class-Incremental Learning (CIL) for Food Image Classification with Memory Replay. CIL models progressively learn new food categories presented sequentially. A compact memory buffer retains a subset of previously encountered data, so the training dataset evolves and can become imbalanced with each incremental training phase. After each training phase, the CIL model is evaluated by its classification accuracy on a balanced test set containing all classes encountered so far.
  • Figure 2: Overview of our proposed method. The method has two main components: compressor training and CIL classifier training. In the compressor training phase, we employ the fixed-decoder strategy to train a neural compressor [minnen2018joint] using only the original data from the current phase. After CIL classifier training, we follow the standard CIL setup of iCaRL and use herding to select representative exemplars. These exemplars are passed through the fine-tuned encoder to produce compressed exemplars; we also perform CAM-based foreground extraction, and the two are combined into the final format stored in the memory buffer. During CIL training, the compressed exemplar bits are decoded back into images, composited with the foreground images, and incorporated as historical samples in the training workflow, so that past knowledge is replayed alongside new data. Note that light-colored images in the figure represent compressed versions of the images (denoted 'Comp.' in the figure).
  • Figure 3: Results on the VFN-74 dataset under the LFH setting with an exemplar set of 5 images per class using various CIL methods. The incremental step size is $M \in \{5, 10\}$. We also include results with the plug-in method CIM [CIM_CIL] and our method.
  • Figure 4: Visualization of different compression methods applied to a food image from the 'steak' class.
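
Two steps named in the Figure 2 caption, herding-based exemplar selection and CAM-guided foreground compositing, can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the list-based image representation, and the 0.5 CAM threshold are all hypothetical, and the herding routine follows the general greedy scheme popularized by iCaRL.

```python
def herding_select(features, k):
    """Greedily pick k indices whose running mean stays closest to the class mean."""
    n, dim = len(features), len(features[0])
    mean = [sum(f[d] for f in features) / n for d in range(dim)]
    chosen, running_sum = [], [0.0] * dim
    for step in range(1, min(k, n) + 1):
        best_idx, best_dist = None, float("inf")
        for i, f in enumerate(features):
            if i in chosen:
                continue
            # Squared distance between the class mean and the candidate running mean.
            dist = sum((mean[d] - (running_sum[d] + f[d]) / step) ** 2
                       for d in range(dim))
            if dist < best_dist:
                best_idx, best_dist = i, dist
        chosen.append(best_idx)
        running_sum = [running_sum[d] + features[best_idx][d] for d in range(dim)]
    return chosen


def composite(decoded, original, cam, threshold=0.5):
    """Keep original pixels where the CAM score reaches the (assumed) threshold,
    and decoded (compressed) pixels everywhere else."""
    return [
        [o if c >= threshold else d for d, o, c in zip(drow, orow, crow)]
        for drow, orow, crow in zip(decoded, original, cam)
    ]


# Toy 2-D features for one class; index 3 is an outlier.
features = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [10.0, 10.0]]
selected = herding_select(features, 2)

# Toy 2x2 single-channel "images" and a CAM heat map in [0, 1].
decoded = [[1, 1], [1, 1]]
original = [[9, 9], [9, 9]]
cam = [[0.9, 0.1], [0.2, 0.7]]
mixed = composite(decoded, original, cam)
```

In the toy data, herding first picks the feature nearest the class mean and then the one that keeps the running mean closest to it, while the composite keeps full-quality pixels only where the CAM marks likely foreground. A real pipeline would operate on feature tensors and multi-channel images, and would store only the foreground region rather than the whole original.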