Flow Matching Meets Biology and Life Science: A Survey

Zihao Li, Zhichen Zeng, Xiao Lin, Feihao Fang, Yanru Qu, Zhe Xu, Zhining Liu, Xuying Ning, Tianxin Wei, Ge Liu, Hanghang Tong, Jingrui He

TL;DR

This paper presents the first comprehensive survey of recent developments in flow matching and its applications in biological domains, and categorizes its applications into three major areas: biological sequence modeling, molecule generation and design, and peptide and protein generation.

Abstract

Over the past decade, advances in generative modeling, such as generative adversarial networks, masked autoencoders, and diffusion models, have significantly transformed biological research and discovery, enabling breakthroughs in molecule design, protein generation, catalysis discovery, drug discovery, and beyond. At the same time, biological applications have served as valuable testbeds for evaluating the capabilities of generative models. Recently, flow matching has emerged as a powerful and efficient alternative to diffusion-based generative modeling, with growing interest in its application to problems in biology and life sciences. This paper presents the first comprehensive survey of recent developments in flow matching and its applications in biological domains. We begin by systematically reviewing the foundations and variants of flow matching, and then categorize its applications into three major areas: biological sequence modeling, molecule generation and design, and peptide and protein generation. For each, we provide an in-depth review of recent progress. We also summarize commonly used datasets and software tools, and conclude with a discussion of potential future directions. The corresponding curated resources are available at https://github.com/Violet24K/Awesome-Flow-Matching-Meets-Biology.

Paper Structure

This paper contains 35 sections, 50 equations, 5 figures, 6 tables.

Figures (5)

  • Figure 1: Flow Matching Meets Biological and Life Sciences. Flow matching serves as a powerful generative modeling paradigm for a wide range of biological and life science applications. Conversely, these domains offer rich and diverse tasks for evaluating and advancing flow matching techniques. In this survey, we first present state-of-the-art flow matching models and their variants, then categorize their applications into four major areas: sequence modeling, molecule generation, protein design, and other emerging biological applications. The corresponding curated resources are available at https://github.com/Violet24K/Awesome-Flow-Matching-Meets-Biology.
  • Figure 2: Trend of published papers on flow matching (FM) and its applications in biology and life sciences across major ML conferences from 2023 to 2025. The blue line indicates the total number of FM papers, while the orange line shows the subset focused on biological applications. Annotations highlight key milestones in FM and its adoption for molecule, sequence, and protein generation, illustrating the rapid growth and expanding interest in this area.
  • Figure 3: Overview of the survey taxonomy. We begin by introducing the foundations of flow matching, including its core models and variants. Our taxonomy then categorizes flow matching applications into major biological domains and tasks.
  • Figure 4: 2D graph representations of example molecules generated from the GEOM-Drugs (Axelrod and Gómez-Bombarelli, 2022) (left two) and QM9 (Ramakrishnan et al., 2014) (right two) datasets. Each molecule is visualized as a 2D graph, where atoms are nodes and chemical bonds are edges, capturing both structural and topological properties.
  • Figure 5: 3D graph representations of example molecules generated from the GEOM-Drugs (Axelrod and Gómez-Bombarelli, 2022) (left two) and QM9 (Ramakrishnan et al., 2014) (right two) datasets. Atoms are shown as nodes positioned in 3D Euclidean space, and bonds are represented as edges connecting them. These visualizations capture spatial geometry and stereochemistry important for molecular property prediction.