MatSynth: A Modern PBR Materials Dataset

Giuseppe Vecchio, Valentin Deschaintre

TL;DR

MatSynth is a dataset of 4,000+ CC0 ultra-high resolution PBR materials that is significantly larger, more diverse, and higher resolution than previously publicly available material sets.

Abstract

We introduce MatSynth, a dataset of 4,000+ CC0 ultra-high resolution PBR materials. Materials are crucial components of virtual relightable assets, defining the interaction of light at the surface of geometries. Given their importance, significant research effort has been dedicated to their representation, creation and acquisition. However, in the past 6 years, most research in material acquisition or generation has relied either on the same single dataset or on a large company-owned library of procedural materials. With this dataset, we propose a significantly larger, more diverse, and higher resolution set of materials than previously publicly available. We carefully discuss the data collection process and demonstrate the benefits of this dataset on material acquisition and generation applications. The complete data further contains metadata with each material's origin, license, category, tags, creation method and, when available, descriptions and physical size, as well as 3M+ renderings of the augmented materials, in 1K, under various environment lightings. The MatSynth dataset is released through the project page at: https://www.gvecchio.com/matsynth.

Paper Structure (21 sections, 7 figures, 3 tables)

Figures (7)

  • Figure 1: Samples of height based material blends. We show renderings of both materials being blended, the result of our blending, and the height-based computed mask.
  • Figure 2: Renderings under various environment maps. We show four materials (Metal, Leather, Plastic and Pebbles) from the dataset rendered under the 5 chosen environment maps.
  • Figure 3: Render samples using the two-pass strategy. This ensures that the maps and the rendering are well aligned, avoiding parallax effects while preserving specular highlights.
  • Figure 4: Category distribution of MatSynth and Deschaintre18's dataset. As there are no duplicates between the two datasets, they can be joined to combine their benefits. We show here that our dataset significantly enriches the previously available data, reducing the inequalities in the category distribution of the previous dataset, in particular for the Fabric or Ground categories.
  • Figure 5: Qualitative material acquisition comparison on synthetic data. We compare Deschaintre18 and SurfaceNet [vecchio2021surfacenet] trained only on Deschaintre18's dataset against the same methods trained on MatSynth (marked with *). We can see that the fine-tuned versions better match the ground truth, in particular for SurfaceNet in the Normal and Roughness maps.
  • ...and 2 more figures