
Clustering Molecular Energy Landscapes by Adaptive Network Embedding

Paula Mercurio, Di Liu

TL;DR

A data-driven approach for clustering the potential energy landscapes of molecular structures: recently developed Network Embedding techniques yield latent variables, defined through the embedding function, that capture dynamical node-node relationships in reduced dimensions.

Abstract

To efficiently explore the chemical space of all possible small molecules, a common approach is to reduce the dimension of the system to facilitate downstream machine learning tasks. Toward this end, we present a data-driven approach for clustering potential energy landscapes of molecular structures by applying recently developed Network Embedding techniques to obtain latent variables defined through the embedding function. To scale up the method, we also incorporate an entropy-sensitive adaptive scheme for hierarchical sampling of the energy landscape, based on Metadynamics and Transition Path Theory. By taking into account the kinetic information implied by a system's energy landscape, we are able to interpret dynamical node-node relationships in reduced dimensions. We demonstrate the framework on Lennard-Jones (LJ) clusters and a human DNA sequence.

Paper Structure (20 sections, 13 equations, 10 figures, 1 table)

Figures (10)

  • Figure 1: Disconnectivity tree and Metadynamics-based embeddings for the Lennard-Jones cluster with 8 atoms. Left: Disconnectivity tree of all local minima. Right: Embeddings of the local minima after applying the Metadynamics adjustment. The color scheme represents the potential energy, with dark blue denoting the lowest and red the highest. Closely related minima have very similar or identical embeddings, e.g., both yellow minima are embedded at the yellow point on the right.
  • Figure 2: The 8-atom LJ network of local minima. Edge lengths are proportional to commute times. Node colors are chosen to match those in Figure \ref{fig:lj8V}.
  • Figure 3: Disconnectivity tree for the 38-atom LJ cluster. The structures of the two lowest-energy configurations are also pictured.
  • Figure 4: Hierarchical embeddings for the LJ cluster with 38 atoms. Pictured are the embeddings before (left) and after (right) applying Metadynamics. The color scheme denotes commute time from the global minimum, with dark blue marking the shortest commute times and red the longest.
  • Figure 5: Embeddings of the local minima of the 38-atom LJ cluster with potential energies less than -170.9. The figure shows the output of two-level embeddings with the Metadynamics adjustment. The color scheme denotes commute time from the global minimum.
  • ...and 5 more figures
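
Several captions above use commute times between local minima as edge lengths or color scales. As a rough illustration of that quantity, the sketch below computes the standard random-walk commute time on a small weighted, undirected graph from the pseudoinverse of its graph Laplacian. This is an assumption-laden toy, not the paper's procedure: the function name `commute_times`, the symmetric weight matrix `W`, and the three-node path graph are all hypothetical, and the paper may derive its commute times differently (for example, from transition rates between minima).

```python
import numpy as np

def commute_times(W):
    """Pairwise commute times of a random walk on a weighted undirected graph.

    W is a symmetric, non-negative weight matrix of a connected graph
    (hypothetical input; not the paper's network of minima).
    Uses C_ij = vol(G) * (L+_ii + L+_jj - 2 L+_ij), where L+ is the
    Moore-Penrose pseudoinverse of the graph Laplacian and vol(G) is the
    sum of all node degrees.
    """
    W = np.asarray(W, dtype=float)
    deg = W.sum(axis=1)              # weighted degree of each node
    L = np.diag(deg) - W             # combinatorial graph Laplacian
    L_pinv = np.linalg.pinv(L)       # pseudoinverse (L itself is singular)
    d = np.diag(L_pinv)
    vol = deg.sum()                  # graph volume
    return vol * (d[:, None] + d[None, :] - 2.0 * L_pinv)

# Toy example: a path graph 0 - 1 - 2 with unit edge weights.
W = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]])
C = commute_times(W)
print(np.round(C, 6))
```

For this path graph the end nodes are twice as far apart in commute time as adjacent nodes, which is the kind of kinetic distance that a layout with edge lengths proportional to commute times is meant to convey.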