M^3ashy: Multi-Modal Material Synthesis via Hyperdiffusion

Chenliang Zhou, Zheyuan Hu, Alejandro Sztrajman, Yancheng Cai, Yaru Liu, Cengiz Oztireli

TL;DR

This work tackles the challenge of synthesizing real-world measured BRDFs by introducing M^3ashy, a multi-modal diffusion framework that uses neural fields as a compact BRDF representation. The pipeline consists of data augmentation to create AugMERL, fitting neural-field representations to form NeuMERL, and training a transformer-based hyperdiffusion model to enable unconditional, multi-modal conditional, and constrained material synthesis. It contributes two datasets (AugMERL and NeuMERL), three BRDF distributional metrics (MMD, COV, 1-NNA), and a constrained synthesis mechanism to guide outputs by material category. The approach enables controllable, high-fidelity material generation across inputs such as material type, textual descriptions, and reference images, with demonstrated improvements over baselines and broader potential for rendering and material understanding.
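
The three metrics named above are not defined in this summary. As a minimal sketch, assuming the acronyms follow their common usage in the generative point-cloud literature (minimum matching distance, coverage, and 1-nearest-neighbor accuracy), they could be computed as below. The Euclidean distance on flattened samples and all function names are illustrative assumptions, not the paper's definitions.

```python
import numpy as np

def pairwise_dist(a, b):
    """Euclidean distances between rows of a (n, d) and rows of b (m, d)."""
    diff = a[:, None, :] - b[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def mmd(generated, reference):
    """Minimum matching distance (assumed meaning of MMD).

    Mean distance from each reference sample to its nearest generated
    sample. Lower values indicate higher fidelity.
    """
    d = pairwise_dist(reference, generated)
    return float(d.min(axis=1).mean())

def cov(generated, reference):
    """Coverage: fraction of reference samples that are the nearest
    neighbor of at least one generated sample. Higher means more diverse."""
    d = pairwise_dist(generated, reference)
    matched = np.unique(d.argmin(axis=1))
    return matched.size / reference.shape[0]

def one_nna(generated, reference):
    """Leave-one-out 1-nearest-neighbor accuracy.

    A value near 0.5 means the two sets are hard to tell apart;
    a value near 1.0 means they are easily separated.
    """
    x = np.concatenate([generated, reference], axis=0)
    labels = np.concatenate([np.ones(len(generated)), np.zeros(len(reference))])
    d = pairwise_dist(x, x)
    np.fill_diagonal(d, np.inf)  # a sample may not match itself
    nearest = d.argmin(axis=1)
    return float((labels[nearest] == labels).mean())
```

In this sketch each sample is a flat vector, for example a flattened BRDF table; a perceptually motivated BRDF distance could replace `pairwise_dist` without changing the rest.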

Abstract

High-quality material synthesis is essential for replicating complex surface properties to create realistic scenes. Despite advances in the generation of material appearance based on analytic models, the synthesis of real-world measured BRDFs remains largely unexplored. To address this challenge, we propose M^3ashy, a novel multi-modal material synthesis framework based on hyperdiffusion. M^3ashy enables high-quality reconstruction of complex real-world materials by leveraging neural fields as a compact continuous representation of BRDFs. Furthermore, our multi-modal conditional hyperdiffusion model allows for flexible material synthesis conditioned on material type, natural language descriptions, or reference images, providing greater user control over material generation. To support future research, we contribute two new material datasets and introduce two BRDF distributional metrics for more rigorous evaluation. We demonstrate the effectiveness of M^3ashy through extensive experiments, including a novel statistics-based constrained synthesis, which enables the generation of materials of desired categories.
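
To make the idea of a neural field as a compact BRDF representation concrete, here is a minimal sketch. It assumes a small ReLU MLP that maps a 6-D input (hypothetically, the incoming and outgoing unit directions) to an RGB reflectance value. The layer sizes, input parameterization, and function names are assumptions for illustration and are not taken from the paper. Flattening the weights shows the kind of fixed-length vector that a diffusion model operating in weight space, as hyperdiffusion is generally understood, would be trained on.

```python
import numpy as np

LAYER_SIZES = [6, 32, 32, 3]  # hypothetical: (w_in, w_out) -> RGB

def init_params(rng, sizes=LAYER_SIZES):
    """Create one (weight, bias) pair per layer with He-style scaling."""
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        w = rng.normal(0.0, np.sqrt(2.0 / n_in), size=(n_in, n_out))
        b = np.zeros(n_out)
        params.append((w, b))
    return params

def brdf_mlp(params, directions):
    """Evaluate the field at a batch of inputs with shape (n, 6)."""
    h = directions
    for w, b in params[:-1]:
        h = np.maximum(h @ w + b, 0.0)  # ReLU on hidden layers
    w, b = params[-1]
    return h @ w + b  # linear output layer, shape (n, 3)

def flatten_params(params):
    """Concatenate all weights and biases into one 1-D vector."""
    return np.concatenate([np.concatenate([w.ravel(), b]) for w, b in params])

def unflatten_params(vec, sizes=LAYER_SIZES):
    """Inverse of flatten_params for a known architecture."""
    params, offset = [], 0
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        w = vec[offset:offset + n_in * n_out].reshape(n_in, n_out)
        offset += n_in * n_out
        b = vec[offset:offset + n_out]
        offset += n_out
        params.append((w, b))
    return params
```

With these hypothetical sizes each material is described by 1379 numbers, and a dataset of such vectors (one per fitted material) is what a weight-space generative model would consume.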

Paper Structure

This paper contains 48 sections, 25 equations, 25 figures, 4 tables, and 1 algorithm.

Figures (25)

  • Figure 1: 3D models and scenes rendered with our synthesized neural materials demonstrate visually rich results.
  • Figure 2: An overview of M^3ashy, our novel neural material synthesis framework, consisting of three main stages. (1, top left) Data augmentation using RGB permutation and PCA interpolation to create an expanded dataset, AugMERL; (2, middle) neural fields fitted to individual materials, resulting in NeuMERL, a dataset of neural material representations; and (3, bottom) training a multi-modal conditional hyperdiffusion model on NeuMERL to enable conditional synthesis of high-quality, diverse materials guided by inputs such as material type, text descriptions, or reference images. We further propose a novel statistics-based constrained synthesis method to generate materials of a specified type (top right).
  • Figure 3: Six RGB permutations of the MERL material blue acrylic. (a) represents the original material. This permutation strategy expands the dataset by a factor of 6.
  • Figure 4: Linear interpolation of two MERL materials, (a) green metallic paint and (f) yellow plastic, in the PCA space.
  • Figure 5: Material synthesis. Baseline models fail to capture the underlying distribution effectively, resulting in homogeneous outputs or severe artifacts. In contrast, M^3ashy successfully captures the complex neural material distribution, achieving significantly better fidelity and diversity. Our materials also support spatially varying rendering configurations (last three columns).
  • ...and 20 more figures
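
The two augmentation steps described in the captions for Figures 2 to 4 (RGB permutation and linear interpolation in PCA space) can be sketched as follows. This is a rough illustration under stated assumptions: each material is stored channel-first as a `(3, n)` array, PCA is fitted with a plain SVD on flattened materials, and any preprocessing the authors apply before PCA is omitted. All function names are hypothetical.

```python
import itertools
import numpy as np

def rgb_permutations(brdf):
    """Return the 6 channel orderings of a (3, n) BRDF array.

    The first entry uses the identity ordering, i.e. the original material.
    """
    return [brdf[list(p)] for p in itertools.permutations(range(3))]

def fit_pca(data, n_components):
    """Fit PCA on data of shape (num_materials, d) via SVD.

    Returns the mean vector and the leading principal directions
    as rows of a (n_components, d) matrix.
    """
    mean = data.mean(axis=0)
    _, _, vt = np.linalg.svd(data - mean, full_matrices=False)
    return mean, vt[:n_components]

def pca_interpolate(a, b, mean, components, t):
    """Linearly blend two flattened materials in PCA coefficient space.

    t = 0 reproduces (the PCA projection of) a, t = 1 that of b.
    """
    za = (a - mean) @ components.T
    zb = (b - mean) @ components.T
    z = (1.0 - t) * za + t * zb
    return mean + z @ components
```

Applying `rgb_permutations` to every material multiplies the dataset size by 6, matching the factor stated for Figure 3. When all components are kept, interpolating in PCA space coincides with linear interpolation of the original vectors; truncating to fewer components restricts the blend to the leading subspace.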