Err on the Side of Texture: Texture Bias on Real Data

Blaine Hoak, Ryan Sheatsley, Patrick McDaniel

TL;DR

This work addresses the pervasive texture bias in image classification by introducing the Texture Association Value (TAV), a metric computed from the texture–object associations that models learn, measured using a broad texture dataset (PTD). It further develops Texture Identification (TID), which detects the textures present in real images by matching model outputs to TAV rows, enabling texture-aware analysis on real data and on natural adversarial examples (ImageNet-A). Across pretrained models and datasets, the study demonstrates that textures strongly influence both accuracy and confidence: dominant textures drive many of the correct predictions, and texture misalignment explains many confident mispredictions on natural adversarial examples (over 90% of such samples contain misaligned textures). The results highlight the need for balanced texture–shape reasoning and provide a scalable framework, along with code, to assess and mitigate texture-driven vulnerabilities in real-world vision systems.

Abstract

Bias significantly undermines both the accuracy and trustworthiness of machine learning models. To date, one of the strongest biases observed in image classification models is texture bias, where models rely too heavily on texture information rather than shape information. Yet existing approaches for measuring and mitigating texture bias have not been able to capture how textures impact model robustness in real-world settings. In this work, we introduce the Texture Association Value (TAV), a novel metric that quantifies how strongly models rely on the presence of specific textures when classifying objects. Leveraging TAV, we demonstrate that model accuracy and robustness are heavily influenced by texture. Our results show that texture bias explains the existence of natural adversarial examples: over 90% of these samples contain textures that are misaligned with the learned texture of their true label, resulting in confident mispredictions.
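
To make the two ideas concrete, the following is a minimal sketch of how a texture–object association matrix and a row-matching step might be computed. It is not the paper's implementation: averaging softmax outputs per texture class, the cosine-similarity matching rule, the function names `texture_association_values` and `identify_texture`, and the toy numbers are all assumptions made for illustration.

```python
import numpy as np

def texture_association_values(probs, texture_ids, num_textures):
    """Assumed TAV-style matrix: rows are textures, columns are object classes.

    probs:       (N, C) model outputs (e.g. softmax) on N texture images
    texture_ids: (N,) integer texture label of each image
    Assumes every texture has at least one image.
    """
    tav = np.zeros((num_textures, probs.shape[1]))
    for t in range(num_textures):
        # Average model output over all images of texture t (an assumption).
        tav[t] = probs[texture_ids == t].mean(axis=0)
    return tav

def identify_texture(tav, output):
    """Assumed TID-style matching: index of the TAV row closest to a model output."""
    norms = np.linalg.norm(tav, axis=1) * np.linalg.norm(output)
    sims = (tav @ output) / (norms + 1e-12)  # cosine similarity per row
    return int(np.argmax(sims))

# Toy data: 4 texture images, 2 textures, 3 object classes (hypothetical values).
probs = np.array([
    [0.8, 0.1, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.1, 0.8],
    [0.1, 0.3, 0.6],
])
texture_ids = np.array([0, 0, 1, 1])

tav = texture_association_values(probs, texture_ids, num_textures=2)
# Model output on a hypothetical real image; it is matched to texture 1.
matched = identify_texture(tav, np.array([0.2, 0.1, 0.7]))
```

In this toy setup each TAV row summarizes which object classes the model associates with a texture, and a real image's output is assigned to the texture whose row it most resembles. The actual metric and matching criterion are defined in the body of the paper.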

Paper Structure

This paper contains 28 sections, 7 equations, 47 figures, 3 tables.

Figures (47)

  • Figure 1: ImageNet-A (Hendrycks et al., 2021) examples misclassified as honeycombs by ResNet50.
  • Figure 2: A subset of the TAV matrix.
  • Figure 3: Images from the ImageNet validation set identified as having grid textures.
  • Figure 4: Average agreement with human evaluators and number of samples evaluated for each predicted texture class. Horizontal line shows the overall agreement with human evaluators.
  • Figure 5: Samples labeled as having a "swirly" texture by human evaluators and a "flecked" texture by the TID.
  • ...and 42 more figures