Molecular topological deep learning for polymer property prediction

Cong Shen, Yipeng Zhang, Fei Han, Kelin Xia

TL;DR

Mol-TDL incorporates both high-order interactions and multiscale properties into a topological deep learning architecture: it represents polymer molecules as a series of simplicial complexes at different scales and builds simplicial neural networks on them.

Abstract

Accurate and efficient prediction of polymer properties is of key importance for polymer design. Traditional experimental tools and density functional theory (DFT)-based simulations for polymer property evaluation are both expensive and time-consuming. Recently, a large number of graph-based molecular models have emerged and demonstrated great potential in molecular data analysis. Despite this progress, these models tend to ignore the high-order and multiscale information within the data. In this paper, we develop molecular topological deep learning (Mol-TDL) for polymer property analysis. Mol-TDL incorporates both high-order interactions and multiscale properties into a topological deep learning architecture. The key idea is to represent polymer molecules as a series of simplicial complexes at different scales and to build simplicial neural networks on them. The information aggregated from the different scales provides a more accurate prediction of polymer molecular properties.
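
As a rough illustration of the "series of simplicial complexes at different scales" idea, the sketch below builds a Vietoris–Rips complex (up to 2-simplices) from atom coordinates at several cut-off distances. It is not the authors' implementation: the function name `vietoris_rips`, the toy planar seven-membered ring, and the 1.4 Å bond length are assumptions made for illustration, and only the cut-off values are taken from the Figure 1 caption.

```python
import itertools
import math


def vietoris_rips(points, cutoff, max_dim=2):
    """Return {dim: [simplices]} for a Vietoris-Rips complex.

    A set of vertices forms a simplex when every pair of its
    vertices lies within `cutoff` of each other.
    """
    n = len(points)

    def close(i, j):
        return math.dist(points[i], points[j]) <= cutoff

    simplices = {0: [(i,) for i in range(n)]}
    for dim in range(1, max_dim + 1):
        simplices[dim] = [
            s
            for s in itertools.combinations(range(n), dim + 1)
            if all(close(i, j) for i, j in itertools.combinations(s, 2))
        ]
    return simplices


# Hypothetical toy geometry: a planar regular heptagon with 1.4 A sides,
# standing in for a seven-membered ring (not real molecular coordinates).
side = 1.4
radius = side / (2 * math.sin(math.pi / 7))
ring = [
    (radius * math.cos(2 * math.pi * k / 7),
     radius * math.sin(2 * math.pi * k / 7),
     0.0)
    for k in range(7)
]

# One complex per scale; cut-offs as listed in the Figure 1 caption.
cutoffs = [2.0, 2.5, 3.0, 3.5, 4.0]
counts = {}
for r in cutoffs:
    cx = vietoris_rips(ring, r)
    counts[r] = (len(cx[0]), len(cx[1]), len(cx[2]))
    print(f"cutoff {r:.1f} A: vertices, edges, triangles = {counts[r]}")
```

In this toy ring, small cut-offs recover only the bonded edges, while larger cut-offs add longer-range edges and triangles. Each scale therefore gives a different complex on which a separate network could operate.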

Paper Structure (19 sections, 21 equations, 7 figures, 11 tables)

This paper contains 19 sections, 21 equations, 7 figures, 11 tables.

Figures (7)

  • Figure 1: Flowchart of the Mol-TDL model. A. Filtration of the Vietoris–Rips complex for cycloheptatriene. In our model, the Vietoris–Rips complex is built at cut-off distances of 2.0, 2.5, 3.0, 3.5, and 4.0 Å. B. Topological deep learning model, which uses a simplicial complex derived from the Vietoris–Rips complex at a given cut-off distance. The neighborhood structure is constructed from interactions among 0-, 1-, and 2-simplices. Message passing (MP) is implemented on this neighborhood structure, followed by a pooling operation and a multilayer perceptron (MLP) for regression.
  • Figure 1: Scatter plots of the values predicted by Mol-TDL for three datasets: $E_{gc}$, $E_{gb}$, $E_{ea}$. The dashed diagonal lines indicate perfect regression.
  • Figure 2: Scatter plots of the values predicted by Mol-TDL for seven datasets: $E_i$, $X_c$, $\varepsilon_0$, $n_c$, $E_{gap}^{crystal}$, $E_{gap}^{chain}$, $\mathrm{\Phi}_e^{BC}$. The dashed diagonal lines indicate perfect regression.
  • Figure 3: Visualization of the representations learned by different models. A. Visualization of Mol-TDL and polyBERT latent representations of different scaffolds (indicated by different colors). B. Visualization of Mol-TDL and polyBERT latent representations for $E_{gb}$, $E_{ea}$ and $E_{gap}^{chain}$.
  • Figure 4: t-SNE visualization results for Mol-TDL and polyBERT. The number in brackets is the Davies–Bouldin index (DBI); the smaller the value, the better the clustering. Different colors indicate different scaffolds.
  • ...and 2 more figures
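
The Figure 4 caption scores clustering quality with the DBI. The sketch below computes the standard Davies–Bouldin index with Euclidean distances on a tiny hand-made example. The source does not say how the authors computed it, so the function name `davies_bouldin` and the toy points are assumptions for illustration only.

```python
import math
from collections import defaultdict


def davies_bouldin(points, labels):
    """Standard Davies-Bouldin index (lower means better-separated clusters).

    For each cluster i, s_i is the mean distance of its points to its
    centroid; the index averages max_{j != i} (s_i + s_j) / d(c_i, c_j).
    """
    clusters = defaultdict(list)
    for p, lab in zip(points, labels):
        clusters[lab].append(p)

    centroids, scatter = {}, {}
    for lab, pts in clusters.items():
        dim = len(pts[0])
        c = tuple(sum(p[d] for p in pts) / len(pts) for d in range(dim))
        centroids[lab] = c
        scatter[lab] = sum(math.dist(p, c) for p in pts) / len(pts)

    keys = list(clusters)
    total = 0.0
    for i in keys:
        total += max(
            (scatter[i] + scatter[j]) / math.dist(centroids[i], centroids[j])
            for j in keys
            if j != i
        )
    return total / len(keys)


# Toy 2-D embedding with two clusters (hypothetical data).
loose = [(0.0, 0.0), (0.0, 2.0), (4.0, 0.0), (4.0, 2.0)]
tight = [(0.0, 0.5), (0.0, 1.5), (4.0, 0.5), (4.0, 1.5)]
labels = [0, 0, 1, 1]

dbi_loose = davies_bouldin(loose, labels)
dbi_tight = davies_bouldin(tight, labels)
print(dbi_loose, dbi_tight)
```

With the same centroids, the tighter clusters give the lower score, which matches the caption's reading that smaller values mean better clustering.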