Fast tensor-based electrostatic energy calculations in the perspective of protein-ligand docking problem

Peter Benner, Boris N. Khoromskij, Venera Khoromskaia, Matthias Stein

TL;DR

This work addresses the bottleneck of computing electrostatic energies in protein-ligand docking for large biomolecular systems. It introduces a range-separated (RS) tensor framework that represents the free-space electrostatic potential on large 3D grids and achieves near-linear $O(n)$ complexity for energy evaluation, with the number of particles $N$ exerting only a logarithmic influence. By decomposing the potential into long-range and short-range components and precomputing a low-rank CP representation of the long-range part, the method reduces energy and gradient evaluations for arbitrary ligand positions to $O(R N_L)$ per configuration, independent of protein size. Numerical experiments on synthetic data and moderate-size biomolecules demonstrate accurate energy assessments and the feasibility of blind docking searches using only electrostatic contributions, suggesting this approach can augment existing stochastic or deterministic docking schemes with substantial speedups.

Abstract

We propose and justify a new approach for fast calculation of the electrostatic interaction energy of clusters of charged particles in constrained energy minimization in the framework of rigid protein-ligand docking. Our "blind search" docking technique is based on the low-rank range-separated (RS) tensor-based representation of the free-space electrostatic potential of the biomolecule, represented on a large $n\times n\times n$ 3D grid. We show that both the collective electrostatic potential of a complex protein-ligand system and the respective electrostatic interaction energy can be calculated by tensor techniques in $O(n)$ complexity, such that the numerical cost of the energy calculation depends only mildly (logarithmically) on the number of particles in the system. Moreover, the tensor representation of the electrostatic potential enables the use of large 3D Cartesian grids (of the order of $n^3 \sim 10^{12}$), which could allow accurate modeling of complexes with several large proteins. In our approach, the selection of correct geometric pose predictions in the localized posing process is based on control of the van der Waals distance between the target molecular clusters. Here, we confine ourselves to constrained minimization of the energy functional, using only fast tensor-based free-space electrostatic energy recalculation for various rotations and translations of both clusters. Numerical tests of the electrostatic energy-based "protein-ligand docking" algorithm, applied to synthetic and realistic input data, present a proof of concept for rather complex particle configurations. The method may be used in the framework of traditional stochastic or deterministic posing/docking techniques.
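The per-configuration cost $O(R N_L)$ quoted in the TL;DR can be illustrated with a minimal sketch. It assumes (this is not taken from the paper) that the precomputed long-range potential is stored as a rank-$R$ CP tensor with factor matrices `A`, `B`, `C`, that ligand particles sit exactly on grid nodes, and that the energy contribution is the plain sum $\sum_j q_j P(x_j)$. The names `cp_energy`, `factors`, `idx`, and `charges` are hypothetical.

```python
import numpy as np

def cp_energy(factors, idx, charges):
    """Sum_j q_j * P[i_j, k_j, l_j] for P = sum_r a_r (x) b_r (x) c_r.

    factors: three (n, R) arrays holding the CP factor vectors as columns.
    idx:     (N_L, 3) integer grid indices of the ligand particles (assumed on-grid).
    charges: (N_L,) ligand charges.
    """
    A, B, C = factors
    ix, iy, iz = idx.T
    # One length-R product per ligand particle: O(R * N_L) work,
    # with no dependence on the number of protein particles.
    vals = np.einsum('jr,jr,jr->j', A[ix], B[iy], C[iz])
    return float(charges @ vals)

# Synthetic check against the full n x n x n tensor (feasible only for tiny n).
rng = np.random.default_rng(0)
n, R, N_L = 16, 5, 7
factors = [rng.standard_normal((n, R)) for _ in range(3)]
idx = rng.integers(0, n, size=(N_L, 3))
charges = rng.standard_normal(N_L)

energy = cp_energy(factors, idx, charges)
full = np.einsum('ir,jr,kr->ijk', *factors)
reference = float(charges @ full[idx[:, 0], idx[:, 1], idx[:, 2]])
```

The full tensor is built here only to validate the sketch; the point of the CP form is that it never needs to be assembled, so storage stays at $O(Rn)$ rather than $O(n^3)$.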

Paper Structure

This paper contains 14 sections, 4 theorems, 45 equations, 12 figures, 2 tables.

Key Result

Theorem 2.2

(Uniform rank bounds for the long-range part [BKK_RS:18]). Let the long-range part ${\bf P}_l$ in the total interaction potential, see (eqn:Long-Range_Sum), correspond to the sinc-approximation for the generating radial function $p(\|x\|)$ with $K=O(\log^2\varepsilon)$, see (eqn:sinc). Then the total … where the constant $C$ does not depend on the number of particles $N$.
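The idea behind a sinc-type approximation and the long-range/short-range split can be sketched for the Newton kernel $1/r$. The sketch below uses the identity $1/r = \frac{2}{\sqrt{\pi}}\int_0^\infty e^{-r^2 t^2}\,dt$ with the substitution $t = e^u$ and a trapezoidal rule in $u$. This particular substitution, the step `h`, the truncation `K`, and the threshold `t_split` are illustrative assumptions and need not match the quadrature in (eqn:sinc).

```python
import numpy as np

def gaussian_sum_terms(h=0.25, K=120):
    """Weights w_k and exponents t_k with 1/r ~ sum_k w_k * exp(-(t_k * r)**2)."""
    u = h * np.arange(-K, K + 1)          # equispaced nodes in u = log(t)
    t = np.exp(u)
    w = 2.0 * h / np.sqrt(np.pi) * t      # trapezoid weight times Jacobian dt/du = t
    return w, t

def split_newton(r, t_split=1.0, h=0.25, K=120):
    """Return (long_range, short_range) parts of the Gaussian sum at radii r."""
    w, t = gaussian_sum_terms(h, K)
    g = w[:, None] * np.exp(-(t[:, None] * r[None, :]) ** 2)
    long_mask = t <= t_split              # wide Gaussians: smooth, slowly decaying
    return g[long_mask].sum(axis=0), g[~long_mask].sum(axis=0)

r = np.linspace(0.5, 10.0, 50)
p_long, p_short = split_newton(r)
# Relative error of the recombined sum against the exact kernel 1/r.
max_rel_err = np.max(np.abs((p_long + p_short) * r - 1.0))
```

In this sketch the narrow Gaussians (large $t_k$) form a part that is negligible away from the origin, while the wide Gaussians form a smooth part, which is the kind of term a low-rank representation is suited to.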

Figures (12)

  • Figure 2.1: The long-range (left) and the short-range (right) parts of the tensor representation of the Newton kernel in x-axis.
  • Figure 2.2: The collective electrostatic potential of a cluster with 782 charged particles (left), its short-range (middle) and the long-range (right) parts.
  • Figure 3.1: Scheme of active summation indexes in (eqn:EnergyLatSum_mixed) (blue rectangle), where $N=M+L$.
  • Figure 3.2: Bounding box $\Pi$ for the small ligand and the containing hypercube $\Omega \supset \Pi$.
  • Figure 3.3: Schematic illustration of the layer-like enveloping ring where the energy optimization has to be implemented.
  • ...and 7 more figures

Theorems & Definitions (7)

  • Definition 2.1
  • Theorem 2.2
  • Remark 3.1
  • Proposition 3.2
  • Lemma 3.3
  • Remark 3.4
  • Lemma 5.1