Machine-learning-enabled interpretation of tribological deformation patterns in large-scale MD data

Hendrik J. Ehrich, Marvin C. May, Stefan J. Eder

TL;DR

The paper tackles the challenge of translating high-dimensional molecular-dynamics tribology data into interpretable deformation patterns. It introduces a data-driven workflow that combines a 32-dimensional latent representation from a shallow autoencoder with a dual-branch CNN–MLP that fuses images and simulation metadata to classify deformation mechanisms, reaching approximately 96% accuracy on validation data. Generalization is assessed more strictly by excluding an entire spatial region containing distinct grains from training. The study provides a proof of concept for automated, data-driven tribological mechanism maps and outlines a path toward predictive, AI-augmented MD workflows that could reduce costly simulation campaigns. It also discusses interpretability via SHAP and identifies future directions in richer data representations and AutoML-driven architecture optimization.

Abstract

Molecular dynamics (MD) simulations have become indispensable for exploring tribological deformation patterns at the atomic scale. However, transforming the resulting high-dimensional data into interpretable deformation pattern maps remains a resource-intensive and largely manual process. In this work, we introduce a data-driven workflow that automates this interpretation step using unsupervised and supervised learning. Grain-orientation-colored computational tomography images obtained from CuNi alloy simulations were first compressed through an autoencoder to a 32-dimensional global feature vector. Despite this strong compression, the reconstructed images retained the essential microstructural motifs (grain boundaries, stacking faults, twins, and partial lattice rotations) while omitting only the finest defects. The learned representations were then combined with simulation metadata (composition, load, time, temperature, and spatial position) to train a CNN-MLP model to predict the dominant deformation pattern. The resulting model achieves a prediction accuracy of approximately 96% on validation data. A refined evaluation strategy, in which an entire spatial region containing distinct grains was excluded from training, provides a more robust measure of generalization. The approach demonstrates that essential tribological deformation signatures can be automatically identified and classified from structural images using machine learning. This proof of concept constitutes a first step toward fully automated, data-driven construction of tribological mechanism maps and, ultimately, toward predictive modeling frameworks that may reduce the need for large-scale MD simulation campaigns.
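
The refined evaluation strategy mentioned in the abstract, holding out an entire spatial region from training, can be illustrated with a minimal sketch. The abstract does not say how the data are organized, so the record layout (the `region` and `time` keys), the number of regions, and the function name `spatial_holdout_split` are all hypothetical and chosen only for illustration.

```python
def spatial_holdout_split(samples, held_out_regions):
    """Send every sample from a held-out region to validation.

    `samples` is a list of dicts with a hypothetical "region" key that
    identifies the spatial region a tomographic slice was taken from.
    """
    train, val = [], []
    for sample in samples:
        if sample["region"] in held_out_regions:
            val.append(sample)
        else:
            train.append(sample)
    return train, val


# Toy data (assumed layout): 4 spatial regions, 5 time steps each.
samples = [{"region": r, "time": t} for r in range(4) for t in range(5)]

# Exclude region 3 from training entirely.
train, val = spatial_holdout_split(samples, held_out_regions={3})

train_regions = {s["region"] for s in train}
val_regions = {s["region"] for s in val}
```

Unlike a random split, this keeps all slices of the held-out region (and therefore its grains) out of the training set, so the validation score is not inflated by near-identical images of the same grains at neighboring time steps.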

Paper Structure

This paper contains 18 sections, 1 equation, 11 figures.

Figures (11)

  • Figure 1: Schematic of conventional MD data analysis. The data distillation workflow follows the tilde-shaped background from (a,b,c,d) to (i), based on the representative example of statistically evaluating the emergence of twin boundaries (TB) in the microstructure. A fully developed ML model could replace all the intermediate steps up to the deformation pattern map.
  • Figure 2: 3D MD system setup overview sketching out the location of the tomographic slices that form the image basis for the ML training data.
  • Figure 3: (a) Deformation pattern map from Ref. [eder_unraveling_2020] as a function of Ni fraction and normal pressure, with the deformation regimes that serve as labels for the final states and the parameter combinations of the MD simulations marked as red circles. (b) Exemplary intermediate plots of relevant time-dependent quantities (top: grain boundary fraction, transition times labeled; bottom: shear layer depth, no distinct transition times) used to assign labels to the transient states.
  • Figure 4: Undercomplete autoencoder with a shallow encoder–decoder design. A $5\times5$ stride-2 convolution and a residual block extract features and reduce spatial resolution before flattening into a 32-dimensional latent space. The decoder expands the latent vector via a dense projection, reshaping, and a stride-2 transposed convolution, followed by a final $3\times3$ sigmoid layer to reconstruct the image.
  • Figure 5: Two-branch neural network architecture combining numeric and categorical metadata and image data to predict two labels (4-class and 5-class). Image data is processed through a convolutional neural network, while metadata is handled via a Multi-Layer Perceptron (MLP). The outputs of both branches are concatenated and passed through additional fully connected layers to produce probabilistic predictions for both labels.
  • ...and 6 more figures
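
The layer sequence in the Figure 4 caption can be followed with some simple shape bookkeeping. The sketch below uses the standard output-size formulas for convolutions and transposed convolutions. Only the 5×5 stride-2 convolution, the stride-2 transposed convolution, the 3×3 output layer, and the 32-dimensional latent space come from the caption. The input size (64×64), the channel counts, the paddings, and the transposed-convolution kernel size are assumptions made so that the numbers work out, and the residual block is assumed to preserve the spatial size.

```python
def conv_out(n, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (n + 2 * padding - kernel) // stride + 1


def conv_transpose_out(n, kernel, stride=1, padding=0, output_padding=0):
    """Spatial output size of a transposed convolution along one dimension."""
    return (n - 1) * stride - 2 * padding + kernel + output_padding


LATENT_DIM = 32        # stated in the Figure 4 caption
IMAGE_SIZE = 64        # assumption: square 64x64 input
IN_CHANNELS = 3        # assumption: RGB orientation coloring
FEATURE_CHANNELS = 16  # assumption: encoder feature maps

# Encoder: 5x5 stride-2 convolution (padding 2 assumed) halves the resolution.
enc_size = conv_out(IMAGE_SIZE, kernel=5, stride=2, padding=2)

# Residual block assumed size-preserving; flattening gives this many features,
# which a dense layer would project down to LATENT_DIM.
flat_features = enc_size * enc_size * FEATURE_CHANNELS

# Decoder: stride-2 transposed convolution (4x4 kernel, padding 1 assumed)
# doubles the resolution after the dense projection and reshape.
up_size = conv_transpose_out(enc_size, kernel=4, stride=2, padding=1)

# Final 3x3 layer (padding 1 assumed) keeps the resolution.
out_size = conv_out(up_size, kernel=3, stride=1, padding=1)

# Ratio of input values to latent values under the assumed input size.
compression = IMAGE_SIZE * IMAGE_SIZE * IN_CHANNELS / LATENT_DIM
```

Under these assumed sizes the decoder output matches the input resolution and the latent vector holds 384 times fewer values than the input image; the actual ratio depends on the true image size, which the caption does not state.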