Data Quality Aware Approaches for Addressing Model Drift of Semantic Segmentation Models

Samiha Mirza, Vuong D. Nguyen, Pranav Mantini, Shishir K. Shah

TL;DR

The paper tackles model drift in semantic segmentation caused by shifts in data distribution and quality during retraining. It proposes two quality-aware data-selection approaches: data quality assessment using the no-reference image quality assessment (IQA) metric BRISQUE, and data conditioning, in which feature vectors learned from the existing model guide an SVM-based selection of new data. In experiments on ADE20K, Cityscapes, and VOC with a U-Net, BRISQUE-based filtering improves performance when distorted data is incorporated, and the feature-vector method improves Dice and F-score. These results offer practical data-curation tools for sustaining segmentation accuracy in dynamic real-world environments.

Abstract

Amid the rapid integration of artificial intelligence (AI) into real-world applications, one pressing challenge is model drift, wherein the performance of AI models gradually degrades over time, compromising their effectiveness in dynamic, real-world environments. Once drift is identified, techniques are needed to handle it, preserving model performance and preventing further degradation. This study investigates two quality-aware strategies to combat model drift: data quality assessment and data conditioning based on prior model knowledge. The former uses image quality assessment metrics to select high-quality training data, improving model robustness, while the latter uses learned feature vectors from existing models to guide the selection of future data, aligning it with the model's prior knowledge. Through comprehensive experimentation, this research aims to shed light on the efficacy of these approaches in enhancing the performance and reliability of semantic segmentation models, thereby contributing to the advancement of computer vision capabilities in real-world scenarios.
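
The quality-assessment strategy described in the abstract can be illustrated with a minimal sketch. It assumes a BRISQUE-like convention in which a lower score means better perceptual quality. The function name `select_by_quality`, the threshold `max_score=40.0`, and the toy scores are hypothetical and do not come from the paper; a real pipeline would replace the lookup with an actual no-reference IQA scorer.

```python
from typing import Callable, Iterable, List, Tuple, TypeVar

Sample = TypeVar("Sample")


def select_by_quality(
    candidates: Iterable[Sample],
    quality_score: Callable[[Sample], float],
    max_score: float,
) -> Tuple[List[Sample], List[Sample]]:
    """Split candidates into (selected, rejected) by a quality threshold.

    Assumes a BRISQUE-like convention where a LOWER score means better
    perceptual quality, so a sample is kept when its score <= max_score.
    """
    selected: List[Sample] = []
    rejected: List[Sample] = []
    for sample in candidates:
        if quality_score(sample) <= max_score:
            selected.append(sample)
        else:
            rejected.append(sample)
    return selected, rejected


# Toy usage: precomputed scores stand in for a real no-reference IQA model.
# Both the scores and the threshold are made up for illustration.
toy_scores = {"img_a.png": 18.2, "img_b.png": 61.5, "img_c.png": 33.0}
kept, dropped = select_by_quality(
    toy_scores,                      # iterating a dict yields its keys
    lambda name: toy_scores[name],   # hypothetical stand-in scorer
    max_score=40.0,
)
```

Only the samples in `kept` would be merged into the retraining set. Passing the scorer as a callable keeps the selection logic independent of any particular IQA library.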

Paper Structure

This paper contains 19 sections, 8 equations, 6 figures, and 6 tables.

Figures (6)

  • Figure 1: Occurrence of model drift. It may happen due to a variety of reasons, with one major reason being the degradation of the quality of the new data.
  • Figure 2: Comparison of model predictions between the original model, trained on the initial datasets, and the updated model, trained on both the initial datasets and newly acquired data. Compared with the original model, the updated model produces more false positives and correspondingly lower F-score values.
  • Figure 3: Incorporating quality-aware data selection into the ML pipeline: The existing model $M_{old}$ is trained with $S_{old}$, and new data $S_{new}$ needs to be integrated. A criterion $\mathcal{G}$ governs data selection from $S_{new}$ so that the added data does not cause negative effects, preventing model drift. Two criteria are investigated: data quality-based and feature learning-based. We compare performance between $M_{old}$ and the updated model $M_{new}$.
  • Figure 4: Illustration of the proposed approach. In the feature learning module, feature vector representations are first learned using an SVM on the prior segmentation network in production. Any new data to be added for retraining the segmentation network then passes through the feature learning module and is fed to the trained SVM, which predicts whether it should be selected for further model training or discarded.
  • Figure 5: Distribution of BRISQUE scores for the original images ($D_{original}$) and for the images after degradation ($D_{distorted}$). Lower BRISQUE scores indicate better quality. The distribution for the original images reflects overall high perceptual quality. After degradation, BRISQUE scores increase overall, reflecting poorer quality.
  • ...and 1 more figures
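
The comparison between $M_{old}$ and $M_{new}$ in Figures 2 and 3 relies on overlap metrics such as Dice and F-score. The sketch below computes both for flat binary masks using their standard definitions. The function names, the toy masks, and the convention of returning 1.0 when both masks are empty are assumptions made for illustration, not details taken from the paper.

```python
from typing import Sequence, Tuple


def confusion_counts(pred: Sequence[int], target: Sequence[int]) -> Tuple[int, int, int]:
    """Return (true positives, false positives, false negatives) for flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    return tp, fp, fn


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice = 2*TP / (2*TP + FP + FN); 1.0 by convention if both masks are empty."""
    tp, fp, fn = confusion_counts(pred, target)
    denom = 2 * tp + fp + fn
    return 1.0 if denom == 0 else 2 * tp / denom


def f_score(pred: Sequence[int], target: Sequence[int], beta: float = 1.0) -> float:
    """F-beta = (1+b^2)*TP / ((1+b^2)*TP + b^2*FN + FP); equals Dice when beta == 1."""
    tp, fp, fn = confusion_counts(pred, target)
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return 1.0 if denom == 0 else (1 + b2) * tp / denom


# Toy masks (made up): the "new" prediction adds two false positives.
target   = [1, 1, 1, 0, 0, 0, 0, 0]
old_pred = [1, 1, 0, 0, 0, 0, 0, 0]   # TP=2, FP=0, FN=1
new_pred = [1, 1, 0, 1, 1, 0, 0, 0]   # TP=2, FP=2, FN=1

dice_old = dice_score(old_pred, target)   # 4/5
dice_new = dice_score(new_pred, target)   # 4/7
```

In this toy case the extra false positives lower the score from 4/5 to 4/7, which is the kind of degradation Figure 2 describes for an updated model retrained on unfiltered data.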