
Analytical coarse grained potential parameterization by Reinforcement Learning for anisotropic cellulose

Xu Dong

Abstract

Cellulose nanocrystals (CNCs) are a form of cellulose with excellent mechanical performance and other desirable properties. According to previous reports, hydrogen bonds play a pivotal role in the anisotropic structure of CNCs. Understanding the structure and mechanical behavior of CNCs on the mesoscopic scale is critical for the development and manufacture of cellulose materials. However, neither experimental observations nor atomistic simulations are well suited to the mesoscopic scale. In this study, we introduce an analytical coarse-grained (CG) potential following an extended bottom-up approach that is directly parameterized using reinforcement learning (RL). RL is a powerful tool for industrial and academic applications in various fields, yet its potential has not been fully exploited in molecular dynamics. RL and Boltzmann inversion were combined to develop a novel CG model of cellulose that represents its anisotropy and polymer stiffness. The resultant CG model is not limited to the target properties used for training and can reproduce dynamic mechanical properties under other conditions without additional training. This model demonstrates that RL can construct a CG potential that is both physically explainable and powerful.

Paper Structure

This paper contains 21 sections, 23 equations, 13 figures, 6 tables.

Figures (13)

  • Figure 1: Orthotropic transverse section and characteristic directions. (a) Characteristic directions of orthotropic transverse sections dominated by HBonding planes. (b) Behaviors in the characteristic directions. Flat residues and HBonds are critical for cellulose anisotropy. The vertical and horizontal models are dominated by HBonds and van der Waals interactions, resulting in brittle failure. Loading the slant model induces friction and rotation of the cellulose residues.
  • Figure 2: Mapping and topology. (a) Mapping and (b) mapped overlays for the cellulose residue and (c) laminar HBonding plane. One cellulose residue was mapped to one backbone CL1 bead and two branched beads, CL2 and CL3. The basic idea behind this mapping is to preserve the flat structure and the HBonds within the HBonding layer. Qualitatively, the distance between CL2 and CL3 from the same residue was sufficient, and the branched beads were expected to be perpendicular to the backbones. Under this mapping, a layer of CNC was mapped as a network structure in which the HBond interactions were emphasized. (d) BD topology. BD interactions are required to preserve molecular structure and performance under stretching, bending, and twisting loads. (e) NB topology. For the HBond (HB) potentials, two symmetric A:H-D relations (CL2:CL3-CL1 and CL3:CL2-CL1) were defined to describe HBonds. The HB potentials were defined using both the A:D distance and the A:H-D angle.
  • Figure 3: RL diagrams. (a) Standard RL diagram. In the standard RL diagram, the agent interacts with the environment continuously and performs actions corresponding to observations. Its optimization target is to maximize the cumulative reward $R(\tau)=\sum_{t=0}^{\infty}\gamma^t r_t$. (b) One-shot RL diagram. In this study, by contrast, a one-shot diagram was applied and only one episode was processed; RL was employed as a nonlinear optimizer guided by its reward function.
  • Figure 4: Training procedures. The target attributes were the BD properties (axial elastic modulus and polymer stiffness) and the NB properties (transverse strength and toughness in the characteristic directions). Before training, the equilibrium bonded geometry parameters and force constants of the BD harmonic potentials were estimated from the mapped AA trajectories using the Boltzmann inversion method. In the first stage (BD training), the RL agent scaled the force constants to reproduce the BD properties; all NB interactions were ignored on the assumption that they are not significant in this case. In the second stage, NB training was performed with the previously obtained BD coefficients to match both the BD and NB properties. Because only one batch of simulations was run for the target properties during training, the transverse stretch performance data were verified manually through replica simulations to confirm statistical agreement. Multiple sets of coefficients were collected in order to reduce the number of independent coefficients, for the sake of computational cost and physical explainability. The results confirm that the assumption made for the BD properties is reasonable, so coupled training was unnecessary.
  • Figure 5: Training convergence and statistics-guided reduction. (a) Reward and its moving average in the first and (b) last training loops. Each training loop contained 1024 training steps. One trial took about 4 minutes, and the overall training took about 40 days. Subsequent training loops built on the preceding ones and reached higher, more stable rewards. (c) Average and maximum rewards for different training loops, both of which confirm convergence. After training, the best coefficients from the different training loops were collected to perform statistics-guided reduction. The first reduction was inferred from the geometric characteristics of the CG models: a constraint of $\sigma_{p11}=\sigma_{p22}=\sigma_{p33}$ was applied (pmn denotes a pair of CLm and CLn), which also helps reproduce brittle behavior in the vertical and horizontal directions. (d) The second reduction, in the distance coefficients, was guided by statistics. Every $\sigma$ except $\sigma_{p23}$ oscillated around its equilibrium distance $d^e$, whereas $\sigma_{p23}$ increased markedly. Hence only $\sigma_{p11}$ and $\sigma_{p23}$ are kept as independent distance coefficients: $\sigma_{p12}=\frac{d^e_{p12}}{d^e_{p11}}\sigma_{p11}, \sigma_{p13}=\frac{d^e_{p13}}{d^e_{p11}}\sigma_{p11}$. (e) The third reduction, in the energy coefficients, was also determined statistically. After the first and second reductions, $\epsilon_{p23}$ was usually the smallest energy coefficient, which may correspond to the enlargement of $\sigma_{p23}$. Hence only $\epsilon_{p11}$ and $\epsilon_{p23}$ are kept as independent energy coefficients: $\epsilon_{p11}=\epsilon_{p22}=\epsilon_{p33}=\epsilon_{p12}=\epsilon_{p13}$.
  • ...and 8 more figures
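
The cumulative reward in the Figure 3 caption, $R(\tau)=\sum_{t=0}^{\infty}\gamma^t r_t$, can be illustrated with a short sketch. The function name and the reward values below are hypothetical. The one-shot case assumes that an episode consists of a single action and a single reward, which is our reading of the caption and not something it states explicitly. Under that assumption the sum collapses to $r_0$ and the discount factor plays no role, which is consistent with using RL as a reward-guided optimizer.

```python
def discounted_return(rewards, gamma):
    """Finite-horizon version of R(tau) = sum_t gamma**t * r_t."""
    total = 0.0
    for t, reward in enumerate(rewards):
        total += (gamma ** t) * reward
    return total


# Standard setting: several rewards collected along one trajectory
# (hypothetical values).
multi_step = discounted_return([1.0, 1.0, 1.0], gamma=0.5)

# One-shot setting (assumed: one action, one reward per episode).
# The return equals the single reward regardless of gamma.
one_shot = discounted_return([0.8], gamma=0.99)
```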
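
The Boltzmann inversion step mentioned in the Figure 4 caption can be sketched for a single harmonic coordinate. This is a minimal illustration on synthetic data, not the authors' workflow. It assumes the convention $U(x)=\frac{1}{2}k(x-x_0)^2$, ignores any Jacobian correction, and uses hypothetical names, units (kcal/mol, K), and numbers. Under these assumptions the sampled coordinate is Gaussian with mean $x_0$ and variance $k_BT/k$, so $k = k_BT/\mathrm{Var}(x)$.

```python
import random
import statistics

KB_KCAL_PER_MOL_K = 0.0019872041  # Boltzmann constant in kcal/(mol*K)


def harmonic_boltzmann_inversion(samples, temperature_k):
    """Estimate (x0, k) for U(x) = 0.5*k*(x - x0)**2 from sampled values.

    A Boltzmann-distributed harmonic coordinate is Gaussian with mean x0
    and variance kB*T/k, hence k = kB*T / variance.
    """
    x0 = statistics.fmean(samples)
    variance = statistics.pvariance(samples, mu=x0)
    k = KB_KCAL_PER_MOL_K * temperature_k / variance
    return x0, k


# Synthetic stand-in for a mapped trajectory (hypothetical numbers).
rng = random.Random(0)
true_x0, true_k, temperature = 5.2, 50.0, 300.0
spread = (KB_KCAL_PER_MOL_K * temperature / true_k) ** 0.5
samples = [rng.gauss(true_x0, spread) for _ in range(20_000)]

x0_est, k_est = harmonic_boltzmann_inversion(samples, temperature)
```

If the force field writes the harmonic term without the factor 1/2, the force constant is half the value returned here. In the procedure described in the caption, such estimates serve only as starting values that are subsequently scaled during BD training.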
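
The coefficient reduction described in the Figure 5 caption can be written as a small helper that expands the independent coefficients into the full pair table. The function name, the dictionary layout, and all numerical values are hypothetical placeholders; only the relations between coefficients are taken from the caption. The independent energy coefficients are taken to be $\epsilon_{p11}$ and $\epsilon_{p23}$, as implied by the constraint $\epsilon_{p11}=\epsilon_{p22}=\epsilon_{p33}=\epsilon_{p12}=\epsilon_{p13}$.

```python
def expand_pair_coefficients(sigma_p11, sigma_p23, eps_p11, eps_p23, d_eq):
    """Build the full sigma/epsilon tables from the independent coefficients.

    d_eq maps pair names ("p11", "p12", "p13") to equilibrium distances d^e.
    """
    sigma = {
        "p11": sigma_p11,
        "p22": sigma_p11,  # first reduction: sigma_p11 = sigma_p22 = sigma_p33
        "p33": sigma_p11,
        # second reduction: scale by the ratio of equilibrium distances
        "p12": d_eq["p12"] / d_eq["p11"] * sigma_p11,
        "p13": d_eq["p13"] / d_eq["p11"] * sigma_p11,
        "p23": sigma_p23,  # kept independent
    }
    # third reduction: every energy coefficient except p23 shares one value
    epsilon = {name: eps_p11 for name in ("p11", "p22", "p33", "p12", "p13")}
    epsilon["p23"] = eps_p23
    return sigma, epsilon


# Placeholder inputs, not values from the paper.
d_eq = {"p11": 4.0, "p12": 5.0, "p13": 6.0}
sigma, epsilon = expand_pair_coefficients(
    sigma_p11=4.0, sigma_p23=7.5, eps_p11=1.2, eps_p23=0.3, d_eq=d_eq
)
```

With this reduction, four independent numbers determine all twelve pair coefficients.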