StableMaterials: Enhancing Diversity in Material Generation via Semi-Supervised Learning

Giuseppe Vecchio

TL;DR

StableMaterials presents a diffusion-based framework for fast, tileable PBR material generation that leverages semi-supervised adversarial distillation from large-scale image models to overcome annotated-data limitations. By learning a shared latent space for textures and materials, and distilling knowledge from SDXL through a latent discriminator, the method achieves greater diversity while maintaining physical plausibility. A Latent Consistency Model enables few-step generation, and a feature-rolling technique ensures tileable outputs with minimal artifacts, complemented by a diffusion-based refiner for high-resolution outputs. Experiments on combined MatSynth-deschaintre data demonstrate strong qualitative and CLIP-based quantitative performance against state-of-the-art methods, with ablations validating design choices and highlighting practical efficiency gains for real-world graphics pipelines.

Abstract

We introduce StableMaterials, a novel approach for generating photorealistic physically based rendering (PBR) materials that integrates semi-supervised learning with Latent Diffusion Models (LDMs). Our method employs adversarial training to distill knowledge from existing large-scale image generation models, minimizing the reliance on annotated data and enhancing the diversity of the generated materials. This distillation approach aligns the distribution of the generated materials with that of image textures from an SDXL model, enabling the generation of novel materials that are not present in the initial training dataset. Furthermore, we employ a diffusion-based refiner model to improve the visual quality of the samples and achieve high-resolution generation. Finally, we distill a latent consistency model for fast generation in just four steps and propose a new tileability technique that removes visual artifacts typically associated with fewer diffusion steps. We detail the architecture and training process of StableMaterials and the integration of semi-supervised training within existing LDM frameworks, and show the advantages of our approach. Comparative evaluations with state-of-the-art methods show the effectiveness of StableMaterials, highlighting its potential applications in computer graphics and beyond. StableMaterials is publicly available at https://gvecchio.com/stablematerials.

Paper Structure (35 sections, 2 equations, 11 figures, 2 tables)

Figures (11)

  • Figure 1: Architecture of StableMaterials. The base model generates a low-resolution material of size 512x512. This generation is then upscaled and refined by the refiner model using SDEdit (Meng et al., 2021), with a patched approach to limit memory requirements.
  • Figure 2: Semi-Supervised training. Both the SDXL model and StableMaterials are prompted to generate the same material. The supervised $\mathcal{L}_2$ loss, between the estimated noise and the added noise, is complemented by an adversarial loss $\mathcal{L}_{\text{adv}}$ computed on the denoised latent from StableMaterials.
  • Figure 3: Image-prompting. StableMaterials accurately captures the visual appearance of the input image, producing realistic materials for both in-domain (on the left) and out-domain (on the right) prompts. The render highlights the model's ability to handle diverse and complex surfaces.
  • Figure 4: Text-prompting. StableMaterials closely follows the input prompt, producing realistic materials for both in-domain (on the left) and out-domain (on the right) samples. The render highlights the model's ability to generate accurate properties for different types of materials.
  • Figure 5: Comparison for image-prompting. We compare StableMaterials with MatFuse, MatGen, and MaterialPalette in image-prompted generation, showing two in-domain (left column) and two out-domain (right column) renderings per model. StableMaterials improves over previous methods in quality and in its ability to capture the visual appearance of the input image. Full set of maps in Supplemental Materials.
  • ...and 6 more figures
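
The Figure 2 caption describes a training objective that combines a supervised $\mathcal{L}_2$ noise-prediction loss with an adversarial loss $\mathcal{L}_{\text{adv}}$ computed on the denoised latent. The NumPy sketch below illustrates how such a combined loss could be assembled. It is not the authors' implementation: the function names, the `adv_weight` balancing factor, the non-saturating form of the adversarial term, and the toy discriminator in the usage note are all assumptions made for illustration. The denoised latent is recovered with the standard DDPM relation between a noisy sample, the predicted noise, and the cumulative noise-schedule coefficient.

```python
import numpy as np


def denoised_latent(noisy_latent, predicted_noise, alpha_bar):
    """Estimate the clean latent from a noisy one (standard DDPM relation).

    Assumes noisy = sqrt(alpha_bar) * clean + sqrt(1 - alpha_bar) * noise.
    """
    return (noisy_latent - np.sqrt(1.0 - alpha_bar) * predicted_noise) / np.sqrt(alpha_bar)


def semi_supervised_loss(predicted_noise, added_noise, noisy_latent,
                         alpha_bar, discriminator, adv_weight=0.1):
    """Combine an L2 noise loss with an adversarial term on the denoised latent.

    `discriminator` is a hypothetical callable returning one logit per sample;
    `adv_weight` is an assumed balancing factor, not a value from the paper.
    """
    # Supervised term: mean squared error between estimated and added noise.
    l2 = np.mean((predicted_noise - added_noise) ** 2)

    # Adversarial term, evaluated on the denoised latent estimate.
    z0_hat = denoised_latent(noisy_latent, predicted_noise, alpha_bar)
    logits = discriminator(z0_hat)
    # Non-saturating generator loss: -log(sigmoid(logits)) = softplus(-logits).
    adv = np.mean(np.logaddexp(0.0, -logits))

    return l2 + adv_weight * adv, l2, adv
```

A usage example with random latents and a placeholder discriminator that simply averages each latent:

```python
rng = np.random.default_rng(0)
clean = rng.normal(size=(2, 4, 8, 8))
noise = rng.normal(size=clean.shape)
alpha_bar = 0.5
noisy = np.sqrt(alpha_bar) * clean + np.sqrt(1.0 - alpha_bar) * noise

toy_discriminator = lambda z: z.mean(axis=(1, 2, 3))  # stand-in, not a trained model
total, l2, adv = semi_supervised_loss(noise, noise, noisy, alpha_bar, toy_discriminator)
```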