
Joint Geometric-Chemical Distance for Protein Surfaces

Himanshu Swami, John M. McBride, Jean-Pierre Eckmann, Tsvi Tlusty

TL;DR

IFACE (Intrinsic Field-Aligned Coupled Embedding) is a correspondence-based framework that aligns protein surfaces through probabilistic coupling of intrinsic geometry with spatially distributed chemical fields. From this alignment it derives a joint geometric-chemical distance that integrates structural and physicochemical discrepancies within a single formulation.

Abstract

Protein function is executed at the molecular surface, where shape and chemistry act together to govern interaction. Yet most comparison methods treat these aspects separately, privileging either global fold or local descriptors and missing their coupled organization. Here we introduce IFACE (Intrinsic Field-Aligned Coupled Embedding), a correspondence-based framework that aligns protein surfaces through probabilistic coupling of intrinsic geometry with spatially distributed chemical fields. From this alignment, we derive a joint geometric-chemical distance that integrates structural and physicochemical discrepancies within a single formulation. Across diverse proteins, this distance separates conformational variability from true structural divergence more effectively than fold-based similarity measures. Applied to the cytochrome P450 family, it reveals coherent family-level organization and identifies conserved buried catalytic pockets despite the complex topology. By linking interpretable surface correspondences with a unified distance, IFACE establishes a principled basis for comparing protein interfaces and detecting functionally related interaction patches across proteins.
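
The idea of a probabilistic coupling that balances geometry against surface chemistry can be illustrated with a small numerical sketch. The Python code below is an illustration under stated assumptions, not the authors' formulation: it uses an entropic, fused Gromov-Wasserstein-style iteration, Euclidean distances between random points as a stand-in for intrinsic surface distances, and a single pooled feature cost. The function names (`sinkhorn`, `feature_cost`, `soft_coupling`, `coupled_distances`) and the parameters `alpha` and `eps` are hypothetical.

```python
import numpy as np

def sinkhorn(cost, p, q, eps=0.05, n_iter=200):
    """Entropic optimal transport between row weights p and column weights q."""
    K = np.exp(-cost / eps)
    u = np.ones_like(p)
    for _ in range(n_iter):
        v = q / (K.T @ u)   # rescale columns toward q
        u = p / (K @ v)     # rescale rows to match p
    return u[:, None] * K * v[None, :]

def feature_cost(Fa, Fb):
    """Squared difference between per-vertex feature vectors, scaled to [0, 1]."""
    C = ((Fa[:, None, :] - Fb[None, :, :]) ** 2).sum(axis=-1)
    return C / max(C.max(), 1e-12)

def soft_coupling(Da, Db, C, alpha=0.5, eps=0.05, n_outer=20):
    """Soft coupling P (n x m); alpha weights structure against features (assumed)."""
    n, m = C.shape
    p, q = np.full(n, 1.0 / n), np.full(m, 1.0 / m)
    scale = max(Da.max(), Db.max(), 1e-12)      # common scale keeps costs near [0, 1]
    Da, Db = Da / scale, Db / scale
    const = (Da**2 @ p)[:, None] + (Db**2 @ q)[None, :]
    P = np.outer(p, q)                          # start from the independent coupling
    for _ in range(n_outer):
        # Structural cost linearized around the current coupling (square loss)
        G = const - 2.0 * Da @ P @ Db.T
        P = sinkhorn(alpha * G + (1.0 - alpha) * C, p, q, eps)
    return P

def coupled_distances(Da, Db, C, P):
    """Correspondence-weighted structural and feature discrepancies."""
    r, c = P.sum(axis=1), P.sum(axis=0)
    # Equals sum_{ijkl} (Da[i,k] - Db[j,l])^2 P[i,j] P[k,l]
    G = (Da**2 @ r)[:, None] + (Db**2 @ c)[None, :] - 2.0 * Da @ P @ Db.T
    d_struct = np.sqrt(max((G * P).sum(), 0.0))
    d_feat = np.sqrt(max((C * P).sum(), 0.0))
    return d_struct, d_feat

# Toy data: random points and 4 random feature fields per vertex (not protein data)
rng = np.random.default_rng(0)
Xa, Xb = rng.normal(size=(12, 3)), rng.normal(size=(9, 3))
Fa, Fb = rng.normal(size=(12, 4)), rng.normal(size=(9, 4))
Da = np.linalg.norm(Xa[:, None] - Xa[None, :], axis=-1)
Db = np.linalg.norm(Xb[:, None] - Xb[None, :], axis=-1)
C = feature_cost(Fa, Fb)
P = soft_coupling(Da, Db, C)
d_struct, d_feat = coupled_distances(Da, Db, C, P)
print(P.shape, d_struct, d_feat)
```

Each row of `P` spreads one vertex's mass over candidate partners on the other surface, which is what makes the correspondence soft. How the structural and feature terms are normalized and combined into a final joint distance is left open here, since the abstract does not specify it.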

Paper Structure (12 sections, 42 equations, 6 figures, 3 tables)

Figures (6)

  • Figure 1: Conceptual workflow of IFACE. (a) Protein surfaces $S_{\alpha}$ and $S_{\beta}$ are represented using geometry and surface feature fields (electrostatics, hydrogen-bond propensity, hydrophobicity, and curvature). (b) An optimal coupling matrix $P_{ij}$ is computed by balancing structural and feature-field similarity. (c) The coupling defines bidirectional soft correspondences between surfaces. (d) Correspondence-based structural and feature-field distances are computed. (e) Distances are normalized across the dataset and then combined: the structural distance and the $M$ feature-field distances together define the chemical and IFACE distances.
  • Figure 2: Comparison and mapping of protein surfaces. Protein 6XRX is shown in the top row and 6XDS in the bottom row. Left: ribbon representations of the protein structures. Middle: electrostatic potential distributions (color scale clipped to the 5th–95th percentile of the combined distribution), where similar colors indicate regions with comparable electrostatic potential. Right: IFACE color mappings showing correspondences from 6XRX to 6XDS. The correspondences (vertex mapping) are computed using combined structural and feature-field similarity, allowing direct comparison of corresponding surface regions.
  • Figure 3: Comparison of TM-distance and surface-based distances for separating conformers of the same protein from distinct proteins. (a) Distance distributions for TM (normalized to [0, 1]), structural, chemical, and IFACE distances, for intra-protein (conformer) and inter-protein pairs across four proteins (6XRX, 5HZ7, 2XZ3, and 6XDS). (b) Receiver operating characteristic (ROC) curves showing classification performance of each distance metric. (c) Precision-recall curves with the random baseline indicated by a dashed line. (d) Distance matrices grouped by protein. Surface-based distances produce clearer block-diagonal structure than TM-distance, indicating stronger intra-protein similarity and inter-protein separation.
  • Figure 4: Comparison of proteins 1JPZ and 1TQN. Top row: ribbon representations of chain A from 1JPZ (left) and 1TQN (right), with the heme prosthetic group shown in red. Bottom left: patch around the heme on 1JPZ (red), containing the heme group responsible for substrate oxidation via oxygen activation. This patch lies within the protein interior and is accessed through substrate tunnels, illustrating non-trivial internal surface geometry. Bottom right: corresponding mapped region on 1TQN (blue), demonstrating that the method identifies a similar buried pocket across proteins.
  • Figure 5: Surface-distance-based characterization of protein pairs involving cytochrome P450s. (a) Distance distributions for structural, chemical, and IFACE distances, comparing same-family (P450–P450) and different-family (P450–non-P450) protein pairs. (b) Receiver operating characteristic (ROC) curves showing family-level classification performance. (c) Precision-recall curves with the random baseline indicated by a dashed line. (d) Distance matrices for structural, chemical, and IFACE distances. P450 proteins form a coherent block separated from histone, hemoglobin, and cell-maintenance proteins, demonstrating consistent family-level separation.
  • ...and 1 more figure
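
The ROC panels in Figures 3 and 5 ask how well a distance separates "same" pairs (conformers of one protein, or members of one family) from "different" pairs. A minimal Python sketch of that evaluation follows, assuming lower distance means "same". The area under the ROC curve is computed through its rank interpretation: the probability that a randomly chosen same-class pair has a smaller distance than a randomly chosen different-class pair. The function name `auc_from_distances` is hypothetical and the toy distances are invented for illustration; they are not results from the paper.

```python
import numpy as np

def auc_from_distances(same, different):
    """ROC AUC for classifying 'same' pairs by small distance.

    Equals the fraction of (same, different) pair combinations in which the
    same-class distance is smaller; ties count as one half.
    """
    s = np.asarray(same, dtype=float)[:, None]
    d = np.asarray(different, dtype=float)[None, :]
    wins = (s < d).sum() + 0.5 * (s == d).sum()
    return wins / (s.size * d.size)

# Invented toy distances (not from the paper)
intra = [0.10, 0.15, 0.20, 0.35]   # e.g. conformers of one protein
inter = [0.30, 0.50, 0.60]         # e.g. pairs of distinct proteins
auc = auc_from_distances(intra, inter)
print(f"AUC = {auc:.3f}")          # 11 of 12 combinations ordered correctly
```

An AUC of 1.0 corresponds to perfectly separated distance distributions, and 0.5 to a distance that carries no information about class membership.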