Unifying Theories in High-Dimensional Biology: Approaches, Challenges and Opportunities

Marianne Bauer, Akshit Goyal, Sidhartha Goyal, Gautam Reddy, Shaon Chakrabarti, Michael M Desai, William Gilpin, Jacopo Grilli, Kabir Husain, Sanjay Jain, Mohit Kumar Jolly, Kyogo Kawaguchi, Aneta Koseska, Milo Lin, Leelavati Narlikar, Simone Pigolotti, Archishman Raju, Krishna Shrinivas, Rahul Siddharthan, Greg J Stephens, Andreas Tiffeau-Mayer, Suriyanarayanan Vaikuntanathan

TL;DR

The paper surveys a spectrum of perspectives on how to describe, predict, and unify high-dimensional biological systems, from low-dimensional, energy-landscape and information-theoretic views to high-dimensional statistical approaches and data-driven AI methods. It argues that robust progress will likely come from a principled blend of effective low-dimensional descriptions (sloppy landscapes, latent spaces, coarse-graining) and high-dimensional analyses that preserve functional diversity and adaptability across scales. Key themes include the role of phase transitions and condensates in cellular organization, the importance of dynamical systems and memory in development, and cross-disciplinary insights from ecology, immunology, and evolution. The work outlines concrete directions for quantifying dimensionality, developing shared mathematical language, and transferring ideas across molecular, cellular, and ecosystem scales to achieve a unifying theory of high-dimensional biology.

Abstract

Across biological subdisciplines, the last decade has seen an explosion of high-dimensional datasets, including datasets for cells, species, immune systems, neurons and behaviour. At the ICTS workshop 'Unifying Theories in High-Dimensional Biophysics' we discussed whether this high dimensionality poses a challenge or opportunity for describing, understanding and predicting biological systems theoretically. We discussed methods, models and frameworks that can help with addressing empirical observations based on these high-dimensional datasets. We summarize the challenges and opportunities that emerged in discussions according to individual participants below.

Paper Structure

This paper contains 29 sections, 4 figures.

Figures (4)

  • Figure 1: Sloppy landscapes of optimal gene regulation. a) Genes are expressed in the spatially complex geometry of the nucleus, where multiple genomic regions upstream and/or downstream of a gene can jointly regulate a single gene as well as multiple genes. b) Even in the simplest possible model, where these regulatory regions activate expression based on threshold concentrations $\theta$ of an input, the optimal positioning of the thresholds that maximizes the transfer of developmental information $I(Z_{\{\theta_1,\theta_2\}};x)$ is 'sloppy'. c) This implies that the eigenvalues $\lambda$ of the Hessian matrix at the optimum span several decades (adapted from Ref. Baueretal).
  • Figure 2: Waddington landscapes constitute a low-dimensional representation of cell fate decisions. The figure shows two distinct potential landscapes that could putatively describe the differentiation of a cell into two alternative types. These constitute different geometrical possibilities for the low-dimensional behaviour. A) There is a direct path from the central attractor into the two alternative attractors. B) On exiting the central attractor, the flow goes towards the second saddle point and is subsequently diverted towards one of the attractors. Adapted from Ref. raju2024geometrical.
  • Figure 3: Contrastive learning can reveal generalizable similarity rules in the complex sequence-to-function map of immune receptors and their ligands. Cartoons of A) a binding energy landscape describing T cell receptor specificity to different peptide ligands and B) the covariance of this landscape with receptor sequence similarity. By 'aligning' the peaks in the landscape, the covariance structure is expected to generalize to a greater extent across ligands, as illustrated by the marginal distributions (insets). C) Contrastive learning optimizes receptor similarity metrics to minimize the distance between receptors that are co-specific to a common ligand, while encouraging uniform use of the representation space. Adapted from Ref. pyo2025data.
  • Figure 4: Low-dimensional dynamics in the behavior of C. elegans. We show a 3D projection of embedded worm posture trajectories, illustrating forward (blue) and backward (red) locomotion through approximately orthogonal collections of cycles, a simplification of the full 7D state space (Ref. ahamed2021capturing). Cartoon arrows are a reminder that the geometry of this space carries fundamental dynamical information, such as the Lyapunov exponents. Figure credit: Tosif Ahamed.
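
The caption of Figure 1 describes 'sloppiness' as Hessian eigenvalues that span several decades at an optimum. The sketch below illustrates that signature on a textbook toy problem, a least-squares fit of a sum of decaying exponentials. This is an assumption chosen for illustration and is not the gene-regulation model of the paper. The rate values, the time grid and all variable names are hypothetical.

```python
import numpy as np

# Toy "sloppy" model (illustrative assumption, not the paper's model):
# y(t) = sum_i exp(-k_i * t), parametrized by the log-rates log(k_i).
t = np.linspace(0.0, 5.0, 200)               # hypothetical time grid
rates = np.array([1.0, 1.5, 2.0, 2.5, 3.0])  # hypothetical decay rates k_i

# Jacobian of y with respect to log(k_i):
# d y / d log(k_i) = -k_i * t * exp(-k_i * t); shape (len(t), len(rates)).
kt = rates[None, :] * t[:, None]
J = -kt * np.exp(-kt)

# For a least-squares cost 0.5 * sum (y - y_data)^2 evaluated at a perfect
# fit, the Hessian is J^T J. Its eigenvalues are the squared singular
# values of J, which avoids forming J^T J explicitly.
singular_values = np.linalg.svd(J, compute_uv=False)
eigenvalues = singular_values ** 2

# Width of the spectrum in decades (powers of ten).
decades = np.log10(eigenvalues.max() / eigenvalues.min())
print("Hessian eigenvalues:", eigenvalues)
print(f"Spectrum spans about {decades:.1f} decades")
```

A wide spread of eigenvalues means the cost is sharply constrained along a few 'stiff' parameter combinations and nearly flat along many others, which is the qualitative picture the caption attributes to the optimal threshold positions.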