On Synthetic Texture Datasets: Challenges, Creation, and Curation

Blaine Hoak, Patrick McDaniel

TL;DR

This work tackles the scarcity of large, diverse texture data by introducing the Prompted Textures Dataset (PTD), a 246,285-image collection across 56 textures generated via a prompt-driven diffusion pipeline. The authors design prompts from descriptor categories, generate images with Stable Diffusion, and apply a three-stage refinement (frequency, patch variance, CLIP alignment) to ensure high texture fidelity and prompt correspondence, while navigating NSFW filtering challenges. Evaluation using standard metrics (Inception Score, FID), Fourier analysis, and human judgments indicates that PTD offers greater diversity and texture realism than the Describable Textures Dataset (DTD), supporting its suitability for texture-based tasks and texture-bias research. They further apply PTD to measure texture bias with the Texture Object Association Values (TAV) metric, revealing meaningful texture-object associations and establishing PTD as a practical tool for interpretability, bias analysis, and robustness studies in vision systems.

Abstract

The influence of textures on machine learning models has been an ongoing investigation, specifically in texture bias/learning, interpretability, and robustness. However, due to the lack of large and diverse texture data available, the findings in these works have been limited, as more comprehensive evaluations have not been feasible. Image generative models are able to provide data creation at scale, but utilizing these models for texture synthesis has been unexplored and poses additional challenges both in creating accurate texture images and validating those images. In this work, we introduce an extensible methodology and corresponding new dataset for generating high-quality, diverse texture images capable of supporting a broad set of texture-based tasks. Our pipeline consists of: (1) developing prompts from a range of descriptors to serve as input to text-to-image models, (2) adopting and adapting Stable Diffusion pipelines to generate and filter the corresponding images, and (3) further filtering down to the highest quality images. Through this, we create the Prompted Textures Dataset (PTD), a dataset of 246,285 texture images that span 56 textures. During the process of generating images, we find that NSFW safety filters in image generation pipelines are highly sensitive to texture (and flag up to 60% of our texture images), uncovering a potential bias in these models and presenting unique challenges when working with texture data. Through both standard metrics and a human evaluation, we find that our dataset is high quality and diverse. Our dataset is available for download at https://zenodo.org/records/15359142.
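
Step (1) of the pipeline, building prompts from descriptors, can be illustrated with a minimal sketch. The descriptor categories, word lists, and prompt template below are hypothetical placeholders chosen for illustration; they are not the lists or template used to build PTD. The sketch only shows the general idea of enumerating every combination of descriptor values and formatting each one into a text-to-image prompt.

```python
from itertools import product

# Hypothetical descriptor categories and values (assumptions for illustration;
# the actual PTD descriptor lists are not reproduced here).
DESCRIPTORS = {
    "texture": ["striped", "bumpy"],
    "color": ["red", "blue"],
    "artifact": ["close-up photo", "pattern"],
}

# Hypothetical prompt template; the field names must match the keys above.
TEMPLATE = "{color} {texture} texture, {artifact}"


def build_prompts(descriptors, template=TEMPLATE):
    """Return one prompt string per combination of descriptor values."""
    keys = list(descriptors)
    prompts = []
    # product() enumerates the Cartesian product of all value lists.
    for values in product(*(descriptors[k] for k in keys)):
        fields = dict(zip(keys, values))
        prompts.append(template.format(**fields))
    return prompts


prompts = build_prompts(DESCRIPTORS)
print(len(prompts))  # number of descriptor combinations
print(prompts[0])    # first generated prompt
```

Each resulting string would then be passed to a text-to-image model such as Stable Diffusion. Because the number of prompts grows multiplicatively with each added descriptor category, a scheme of this kind scales to a large dataset from a small set of word lists.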

Paper Structure

This paper contains 24 sections, 1 equation, 6 figures, 3 tables.

Figures (6)

  • Figure 1: Examples of images flagged as NSFW.
  • Figure 2: Ratio of total images flagged as NSFW (red) and ratio of prompts with at least one flagged image (blue), organized by word present in the prompt.
  • Figure 3: Inception and FID Scores of each texture class in PTD (ours) and DTD (Cimpoi et al., 2014). Classes are sorted by mean Inception Score.
  • Figure 4: Mean power spectrum of DTD (left) and PTD (right).
  • Figure 5: Average human representative score for all images at or below a given CLIP score quantile cutoff.
  • ...and 1 more figure