MoMa: A Modular Deep Learning Framework for Material Property Prediction

Botian Wang, Yawen Ouyang, Yaohui Li, Yiqun Wang, Haorui Cui, Jianbing Zhang, Xiaonan Wang, Wei-Ying Ma, Hao Zhou

TL;DR

MoMa is a modular deep learning framework designed to address the diversity and disparity of material property prediction tasks. It builds a MoMa Hub of task-specific modules trained on a broad set of high-resource material properties, then uses Adaptive Module Composition to select and combine a synergistic subset of modules for a given downstream task, followed by task-specific fine-tuning. Across 17 downstream datasets, MoMa improves on the strongest baseline by about 14% on average and performs strongly in few-shot and continual-learning settings, indicating data efficiency and scalability. Privacy-preserving module sharing and interpretable module relevance make MoMa suited to rapid, collaborative materials discovery, with practical impact on energy, electronics, and manufacturing applications.

Abstract

Deep learning methods for material property prediction have been widely explored to advance materials discovery. However, the prevailing pre-train then fine-tune paradigm often fails to address the inherent diversity and disparity of material tasks. To overcome these challenges, we introduce MoMa, a Modular framework for Materials that first trains specialized modules across a wide range of tasks and then adaptively composes synergistic modules tailored to each downstream scenario. Evaluation across 17 datasets demonstrates the superiority of MoMa, with a substantial 14% average improvement over the strongest baseline. Few-shot and continual learning experiments further highlight MoMa's potential for real-world applications. Pioneering a new paradigm of modular material learning, MoMa will be open-sourced to foster broader community collaboration.

Paper Structure

This paper contains 45 sections, 5 equations, 7 figures, 4 tables, 1 algorithm.

Figures (7)

  • Figure 1: Illustration of the diversity of material properties (top) and systems (bottom). Note that material tasks are also disparate, with different laws governing the diverse properties and systems. These characteristics pose challenges for pre-training material property prediction models.
  • Figure 2: A comparison between the pre-train fine-tune paradigm and MoMa's modular framework. (Left) The prevailing scheme pre-trains on force field data (with supervised prediction of energy, force, and stress) and then transfers to downstream tasks. (Right) The modular learning scheme in MoMa trains and stores modules for a broad spectrum of material tasks, and adaptively composes them given a new material property prediction task.
  • Figure 3: The MoMa framework. (a) During the Module Training & Centralization stage, MoMa trains full and adapter modules for a wide spectrum of material tasks, constituting the MoMa Hub; (b) The Adaptive Module Composition (AMC) & Fine-tuning stage leverages the modules in MoMa Hub to compose a tailored module for each downstream task. The AMC algorithm comprises three steps: 1. module prediction estimation (with $k$NN); 2. module weight optimization; 3. module composition. The composed module is further fine-tuned on the task for better adaptation.
  • Figure 4: Ablation study of AMC. The main results using AMC (purple) are compared with ablated variants (orange) that replace AMC with select average, all average, and random selection. Each axis represents the MAE on one dataset, so a smaller area is better. The ablated variants are inferior to the main results in 13, 15, and 15 out of 17 tasks, respectively.
  • Figure 5: The average test losses of MoMa and JMP-FT across 17 downstream tasks under varying data availability settings. MoMa consistently outperforms JMP-FT in all settings. The loss reduction amplifies as the data size shrinks, highlighting the advantage of MoMa in few-shot settings. Results are averaged over five random data splits.
  • ...and 2 more figures
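
The three AMC steps named in the Figure 3 caption (module prediction estimation with $k$NN, module weight optimization, module composition) can be illustrated with a minimal NumPy sketch. Everything beyond those three step names is an assumption made for illustration and is not taken from the paper: the leave-one-out $k$NN regressor, the squared-error objective, the constraint that weights lie on the probability simplex, the projected gradient solver, and the parameter-wise weighted average. The function names (`knn_loo_predictions`, `optimize_weights`, `compose_modules`) are hypothetical, and each module is represented only by precomputed embeddings of the downstream training set and a dictionary of parameter arrays.

```python
import numpy as np


def knn_loo_predictions(embeddings, labels, k=3):
    """Step 1 (assumed form): leave-one-out kNN regression in one module's embedding space."""
    diffs = embeddings[:, None, :] - embeddings[None, :, :]
    dists = np.sum(diffs ** 2, axis=-1)
    np.fill_diagonal(dists, np.inf)  # a sample may not be its own neighbor
    neighbors = np.argsort(dists, axis=1)[:, :k]
    return labels[neighbors].mean(axis=1)


def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    idx = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - (css - 1.0) / idx > 0)[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1)
    return np.maximum(v - theta, 0.0)


def optimize_weights(preds, labels, steps=500, lr=0.5):
    """Step 2 (assumed form): minimize the MSE of the weighted kNN ensemble.

    preds has shape (n_modules, n_samples); projected gradient descent keeps
    the weights non-negative and summing to one.
    """
    n_modules = preds.shape[0]
    w = np.full(n_modules, 1.0 / n_modules)
    for _ in range(steps):
        residual = w @ preds - labels
        grad = 2.0 * preds @ residual / labels.size
        w = project_to_simplex(w - lr * grad)
    return w


def compose_modules(module_params, weights):
    """Step 3 (assumed form): parameter-wise weighted average of the modules."""
    keys = module_params[0].keys()
    return {
        key: sum(w * params[key] for w, params in zip(weights, module_params))
        for key in keys
    }


# Toy usage: module 0 has embeddings that track the label, module 1 has noise.
rng = np.random.default_rng(0)
labels = np.linspace(0.0, 1.0, 40)
emb_good = labels[:, None]
emb_bad = rng.normal(size=(40, 2))
preds = np.stack([knn_loo_predictions(e, labels) for e in (emb_good, emb_bad)])
weights = optimize_weights(preds, labels)
toy_params = [{"w": np.array([1.0, 0.0])}, {"w": np.array([0.0, 1.0])}]
composed = compose_modules(toy_params, weights)
print(weights, composed["w"])
```

In this toy setup the module whose embeddings are informative about the target receives the larger weight, which is the behavior the caption's "tailored module" suggests. The composed parameters would then serve as the initialization for fine-tuning on the downstream task.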