Deciphering Molecular Charge Anisotropy: the Case of Antibody Solutions

Fabrizio Camerin; Susana Marin-Aguilar; Anna Stradner; Peter Schurtenberger; Emanuela Zaccarelli

Abstract

Electrostatic interactions fundamentally govern the structure, stability, and dynamics of charged (bio)matter, yet the impact of heterogeneous and anisotropic charge distributions on the behavior of protein solutions remains elusive. Here, we introduce a versatile multiscale framework that directly connects molecular-level electrostatics to collective properties via colloid-inspired coarse-grained modeling combined with neural network-assisted optimization. Using monoclonal antibodies as a model system, our inverse design approach identifies charge patterns that reliably reproduce experimental structure factors, osmotic compressibility, and collective diffusion coefficients over a wide range of protein concentrations. Close inspection of our data further uncovers how specific physical features and spatial arrangements of localized charge patches significantly influence the solution structure. This transferable strategy provides a predictive pathway to decode and control charge-driven interactions in complex biomolecules and, more generally, in heterogeneously charged soft matter systems, with immediate relevance to protein formulation and biomaterials engineering.

Paper Structure (6 sections, 13 equations, 12 figures, 3 tables)

Numbering coarse-grained model
Effect of varying parameters on charges
Features of the initial training dataset
Complete list of MSE for each optimization cycle
Static structure factors for the best combinations overall
Complete list of features used for the SHAP algorithm

Figures (12)

Figure 1: Modeling strategy. (a) Amino acid representation of the antibody under investigation. The color coding reflects the charge of each amino acid. (b) Electrostatic isopotential surfaces for the same antibody as in (a). The color coding reflects the potential of the surface at $\pm 1k_BT/e$. (c) Provisional 18-bead coarse-grained model for the antibody, in which two 9-bead layers are superimposed on each other. Blue beads show the (likely) charge assignment given the representation in (b), while grey beads remain undetermined given the complex arrangement of charges in the corresponding regions. In all cases, panels I and II show the same antibody from two different perspectives, highlighting its two sides, side $A$ and side $B$, respectively, as indicated by the reported axes.
Figure 2: Varying parameters. (Left) Representative charge distributions on side $A$ of the antibody model, labeled tips $t$, middle $m$, tips and middle $tm$, and spread $s$ according to the position of the negative charges (red beads). Side $B$ is kept fixed in all cases and is not shown. The color coding refers to the charge $Z$ assigned to each bead. (Right) Static structure factor $S(q)$ as a function of the scattering vector $q$ in units of $\sigma^{-1}$, with $\sigma$ the bead size, upon varying (a) the total charge $Q$, (b) the charge difference $\Delta Q$, and (c) the charge distribution $\mathcal{Q}$, while keeping the other parameters fixed, for $c=20$ mg/ml. The legends report the nominal values of the parameters; the actual values may differ slightly (see Results and Methods). As a reference, circles indicate the experimental $S(q)$ for $c=20$ mg/ml, taken from Ref. camerin2026beyond.
Figure 3: Schematics of the optimization protocol. (a) The static structure factors $S(q)$ for two concentrations, $c_1=20$ mg/ml and $c_2=100$ mg/ml, obtained from a training dataset are concatenated into an array $\mathbf{S}$. Each row of this array is linked to a unique charge distribution stored in another array $\mathbf{Q}$. (b) The experimental target data are arranged in an array $\mathbf{S}_{exp}$ with the same discrete scattering vectors $q$ as for $\mathbf{S}$. (c) After training the neural network on the data in (a) and passing it the experimental data from (b), several output charge distributions $\mathbf{Q}_{out}$ are obtained. The variability is mainly on side $A$ of the antibody, since the charge distribution of side $B$ does not vary much in the training dataset. (d) Based on the physical features extracted from the output distributions and on the level of agreement with the target, either (I) the initial training dataset is enlarged by including additional, closer-to-the-target $S(q)$ curves and the overall process is repeated in a new cycle, thus creating updated $\mathbf{S}$ and $\mathbf{Q}$ arrays, or (II) the protocol is terminated.
Figure 4: Optimization cycles. For each of the three optimization cycles (I, II, III), we show the relevant data for (a) the training phase, (b) the output, and (c) the charge distribution candidates. In (a), we show the static structure factors $S(q)$ (grey lines) used in each cycle for the training dataset for (upper panels) $c_1=20$ mg/ml and (lower panels) $c_2=100$ mg/ml. Symbols are the respective experimental data, taken from Ref. camerin2026beyond. (b) The level of agreement of the output with the experimental data is then measured for several different configurations (three are shown) by means of the mean squared error (MSE). (c) The corresponding charge distributions are displayed for side $A$ only, since variations on side $B$ are minimal. For each charge distribution, the values of $Q$ and $\Delta Q$ are also reported. Additional data are shown in the SI.
Figure 5: Best (and off) charge distributions. (a) Mean squared error (MSE) and (b) static structure factors $S(q)$ as a function of the scattering vector $q$ for antibody concentrations $c=20$ and $100$ mg/ml, and (c) corresponding snapshots for the best five charge distributions, drawn both from specifically created configurations used for the training datasets and from output configurations of the NN. (d) Representative static structure factors $S(q)$ as a function of the scattering vector $q$ for an antibody concentration $c=20$ mg/ml, and (e) corresponding snapshots for representative charge configurations whose charge features yield unsatisfactory agreement with the experimental $S(q)$. Symbols in (b) and (d) are experimental data, taken from Ref. camerin2026beyond.
...and 7 more figures
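
The cycle described in the captions of Figures 3 and 4 (train on $\mathbf{S}$ and $\mathbf{Q}$, pass $\mathbf{S}_{exp}$, score the outputs by MSE, then enlarge the dataset or stop) can be illustrated with a minimal control-flow sketch. This is not the authors' implementation. The names `concat_sq`, `mse`, `run_cycles`, `fit_inverse`, `simulate`, and `tol` are hypothetical, the neural network and the simulation are left as user-supplied callables, and adding every simulated candidate back to the training set is a simplifying assumption (the caption only states that closer-to-the-target curves are added). The default of three cycles mirrors the three cycles shown in Figure 4.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Row = List[float]  # one S(q) row or one charge distribution (hypothetical layout)


def concat_sq(sq_c1: Sequence[float], sq_c2: Sequence[float]) -> Row:
    """Build one row of S: S(q) at c1 followed by S(q) at c2 on the same q grid."""
    if len(sq_c1) != len(sq_c2):
        raise ValueError("both concentrations must share the same q grid")
    return list(sq_c1) + list(sq_c2)


def mse(s_model: Sequence[float], s_exp: Sequence[float]) -> float:
    """Mean squared error between a model row and the experimental row."""
    if len(s_model) != len(s_exp) or len(s_exp) == 0:
        raise ValueError("rows must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(s_model, s_exp)) / len(s_exp)


def run_cycles(
    S: Sequence[Row],
    Q: Sequence[Row],
    s_exp: Row,
    fit_inverse: Callable[[List[Row], List[Row]], Callable[[Row], List[Row]]],
    simulate: Callable[[Row], Row],
    tol: float,
    max_cycles: int = 3,
) -> Tuple[Optional[Tuple[float, Row]], int]:
    """Repeat train -> predict -> score, enlarging the dataset after each cycle.

    fit_inverse stands in for training the neural network (S rows -> Q rows);
    simulate stands in for computing the S(q) row of a charge distribution.
    Returns the best (mse, charge distribution) pair and the cycles performed.
    """
    S, Q = list(S), list(Q)  # work on copies so the caller's data is untouched
    best: Optional[Tuple[float, Row]] = None
    cycles_run = 0
    for _ in range(max_cycles):
        cycles_run += 1
        predict = fit_inverse(S, Q)       # (a) train on the current S -> Q pairs
        for q_out in predict(s_exp):      # (c) candidate charge distributions
            s_out = simulate(q_out)       # S(q) row obtained for this candidate
            err = mse(s_out, s_exp)       # agreement with the experimental target
            if best is None or err < best[0]:
                best = (err, q_out)
            S.append(s_out)               # (d-I) enlarge the training dataset
            Q.append(q_out)               # (assumption: every candidate is kept)
        if best is not None and best[0] <= tol:
            break                         # (d-II) terminate the protocol
    return best, cycles_run
```

Keeping `fit_inverse` and `simulate` as injected callables keeps the loop independent of any particular network architecture or simulation engine, neither of which is specified in the captions above.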
