Data-Driven Dynamic Friction Models based on Recurrent Neural Networks

Gaëtan Cortes, Joaquin Garcia-Suarez

TL;DR

This work demonstrates that GRU-based recurrent neural networks can learn the dynamic evolution of rate-and-state friction by training on synthetic RSF data generated from aging or slip laws. A physics-informed loss, built with automatic differentiation, enforces key frictional behaviors such as the direct velocity effect and healing, enabling the network to predict friction variations under velocity jumps without explicit state variables. Results show that the GRU captures RSF-driven friction dynamics with mean test errors around 12% in noiseless data and around 17% on noisy data (median errors substantially lower), indicating robust performance even with limited training data. The approach points toward integrating data-driven friction models into larger-scale simulations, while acknowledging limitations in healing modeling and the need for additional physics-informed losses and real experimental data to improve generalization. Overall, the study highlights the potential of data-driven, history-aware neural models to complement or replace phenomenological RSF formulations in simulations of frictional interfaces.
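
For reference, the rate-and-state relations named in this summary are commonly written as below. The symbols ($\mu_0$, $a$, $b$, $V_0$, $D_c$, $\theta$) follow standard RSF usage and are an assumption here, since this page does not reproduce the paper's own notation.

```latex
\begin{align}
  \mu &= \mu_0 + a \ln\frac{V}{V_0} + b \ln\frac{V_0\,\theta}{D_c}
      && \text{(friction coefficient)} \\
  \dot{\theta} &= 1 - \frac{V\theta}{D_c}
      && \text{(aging law)} \\
  \dot{\theta} &= -\frac{V\theta}{D_c}\,\ln\frac{V\theta}{D_c}
      && \text{(slip law)} \\
  \left.\frac{\partial \mu}{\partial \ln V}\right|_{\theta} &= a > 0
      && \text{(direct velocity effect)}
\end{align}
```

The last line is the property that a loss term built on automatic differentiation can target: at fixed state, friction must increase with the logarithm of the sliding velocity.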

Abstract

In this concise contribution, it is demonstrated that Recurrent Neural Networks (RNNs) based on the Gated Recurrent Unit (GRU) architecture can learn the complex dynamics of rate-and-state friction (RSF) laws from synthetic data. The training data are generated from traditional RSF equations coupled with either the aging law or the slip law for state evolution. A novel aspect of this approach is the formulation of a loss function that explicitly accounts for the direct effect by means of automatic differentiation. It is found that the GRU-based RNNs effectively learn to predict changes in the friction coefficient resulting from velocity jumps (with and without noise in the target data), thereby showcasing the potential of machine learning models in capturing and simulating the physics of frictional processes. Current limitations and challenges are discussed.
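
A minimal sketch of how synthetic RSF sequences of this kind could be generated is given below. It is not the authors' code: the function name `rsf_friction` and all parameter values (`mu0`, `a`, `b`, `d_c`, `v0`) are illustrative assumptions, and the state update uses the closed-form solution of each law, which is exact only when the velocity is constant and strictly positive within a time step (so true holds at zero velocity are not covered).

```python
import math


def rsf_friction(velocities, dt, law="aging", mu0=0.6, a=0.010, b=0.015,
                 d_c=1e-5, v0=1e-6):
    """Friction history for a piecewise-constant, strictly positive velocity protocol.

    All default parameter values are illustrative assumptions.
    """
    # Start at steady state for the first velocity (theta_ss = d_c / v).
    theta = d_c / velocities[0]
    mu = []
    for v in velocities:
        decay = math.exp(-v * dt / d_c)
        if law == "aging":
            # d(theta)/dt = 1 - v*theta/d_c, integrated exactly over dt.
            theta = d_c / v + (theta - d_c / v) * decay
        elif law == "slip":
            # d(theta)/dt = -(v*theta/d_c) * ln(v*theta/d_c); the quantity
            # x = ln(v*theta/d_c) decays exponentially at constant v.
            x = math.log(v * theta / d_c)
            theta = (d_c / v) * math.exp(x * decay)
        else:
            raise ValueError(f"unknown state evolution law: {law!r}")
        # Rate-and-state friction coefficient at the end of the step.
        mu.append(mu0 + a * math.log(v / v0) + b * math.log(v0 * theta / d_c))
    return mu


# Usage: a tenfold velocity jump after 100 steps at the reference velocity.
protocol = [1e-6] * 100 + [1e-5] * 2000
mu_aging = rsf_friction(protocol, dt=0.01, law="aging")
mu_slip = rsf_friction(protocol, dt=0.01, law="slip")
```

In this sketch the friction coefficient rises immediately at the jump (the direct effect) and then relaxes toward the new steady-state value $\mu_0 + (a - b)\ln(V/V_0)$; a sequence of input velocities and output friction values like this is the kind of pair a recurrent network would be trained on.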

Paper Structure (16 sections, 17 equations, 7 figures, 6 tables)

Figures (7)

  • Figure 1: (a)(b)
  • Figure 2: Results sampler. Results correspond to a network with hidden state size $|\boldsymbol{h}| = 10$, trained with 105 sequences, batch size equal to 1, $\lambda_1 = 1$, $\lambda_2 = \lambda_3 = 0.1$ and $\lambda_4 = 0.01$, $\lambda_5 = 0$ for noiseless data and $\lambda_5 = 10^{-4}$ for noisy data. (a) Training loss evolution: most of the loss magnitude corresponds to data tracking in supervised learning. The other terms (inset) become relevant at the end of the process, once the parameters have been adjusted to minimize (\ref{eq:MAE}). (b) Error bins (150 test sequences): for noiseless data, the average error is 12%, while the median error is just 4%; for noisy data, the average is 17% and the median 11%. The error for noisy data is computed against the underlying noiseless data. (c) Test example #1 (noiseless): light right axis for loading velocity, dark left axis for friction coefficient evolution. The dashed background line represents the velocity protocol. Emphasis on gradient magnitude (computed with autodiff): the top inset is aligned with the horizontal normalized time axis ($T$ represents the duration of the experiment) and reveals that most of the change (intense gradients) happens during the hold and at velocity jumps. (d) Test example #2 (noisy): same axes and background as in (c). Emphasis on how the network is able to predict the underlying evolution despite the presence of noise in the training data.
  • Figure 3: Why does the network fail to predict? Showcasing two representative pathological cases. Results correspond to a network with hidden state size $|\boldsymbol{h}| = 100$, trained with 700 sequences, batch size equal to 5, $\lambda_1 = 1$, $\lambda_2 = \lambda_3 = 0.1$ and $\lambda_4 = 0.01$, $\lambda_5 = 0$ for noiseless data and $\lambda_5 = 10^{-4}$ for noisy data. Left: strong gradient at the beginning of the sequence. The initial velocity is much larger than the one at the next timestep, which leads not only to a large jump magnitude but also to a conflict with the initial condition used to stabilize initial gradients, \ref{eq:loss3}. Right: friction variations subsumed by noise. In this case, the velocity protocol always features relatively low velocities, so much so that the magnitude of the friction changes associated with the velocity jumps is considerably smaller than the amplitude of the noise.
  • Figure A: Best performance aging.
  • Figure B: Worst performance aging.
  • ...and 2 more figures