Physics-Informed Computer Vision: A Review and Perspectives

Chayan Banerjee, Kien Nguyen, Clinton Fookes, George Karniadakis

TL;DR

This paper surveys physics-informed computer vision (PICV), outlining how fundamental physical laws and priors can be embedded into vision models to improve robustness, data efficiency, and physical plausibility. It introduces a unified taxonomy that spans physics-informed ML (PIML) and CV-specific priors, categorizes those priors, and maps where they are incorporated along the standard CV pipeline. The review covers task groups ranging from imaging inverse problems and super-resolution to generation, analysis, predictive modeling, and human-centric tasks, and illustrates them with concrete methods such as PINNs, physics-based losses, and physics-conditioned generative models. The surveyed works report improvements in metrics such as PSNR, IoU, and RMSE across diverse domains, while benchmarking, prior selection, uncertainty, and interpretability remain open questions. The work highlights cross-domain synergies and provides a roadmap for future PICV research aimed at better physical consistency, generalization, and performance under limited data or challenging conditions.

Abstract

The incorporation of physical information in machine learning frameworks is opening and transforming many application domains. Here the learning process is augmented through the induction of fundamental knowledge and governing physical laws. In this work, we explore their utility for computer vision tasks in interpreting and understanding visual data. We present a systematic literature review of more than 250 papers on formulation and approaches to computer vision tasks guided by physical laws. We begin by decomposing the popular computer vision pipeline into a taxonomy of stages and investigate approaches to incorporate governing physical equations in each stage. Existing approaches in computer vision tasks are analyzed with regard to what governing physical processes are modeled and formulated, and how they are incorporated, i.e. modification of input data (observation bias), modification of network architectures (inductive bias), and modification of training losses (learning bias). The taxonomy offers a unified view of the application of the physics-informed capability, highlighting where physics-informed learning has been conducted and where the gaps and opportunities are. Finally, we highlight open problems and challenges to inform future research. While still in its early days, the study of physics-informed computer vision has the promise to develop better computer vision models that can improve physical plausibility, accuracy, data efficiency, and generalization in increasingly realistic applications.
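
As an illustration of the learning-bias route mentioned in the abstract (modifying the training loss), the following is a minimal sketch and not a method taken from the review. It assumes a 1D heat equation u_t = alpha * u_xx as the governing law and uses a finite-difference residual in place of the automatic differentiation that PINN-style methods typically rely on. The names `heat_residual`, `mean_square`, `physics_informed_loss`, and `weight`, as well as the grid values, are hypothetical.

```python
import math


def heat_residual(u, dx, dt, alpha):
    """Finite-difference residual of u_t - alpha * u_xx at interior grid points.

    u[n][i] approximates u(t_n, x_i); forward difference in time,
    central difference in space.
    """
    residual = []
    for n in range(len(u) - 1):
        for i in range(1, len(u[n]) - 1):
            u_t = (u[n + 1][i] - u[n][i]) / dt
            u_xx = (u[n][i + 1] - 2.0 * u[n][i] + u[n][i - 1]) / dx**2
            residual.append(u_t - alpha * u_xx)
    return residual


def mean_square(values):
    values = list(values)
    return sum(v * v for v in values) / len(values)


def physics_informed_loss(u_pred, u_obs, dx, dt, alpha, weight=1.0):
    """Return (total, data_term, physics_term) for a predicted field."""
    data_term = mean_square(
        p - o for row_p, row_o in zip(u_pred, u_obs) for p, o in zip(row_p, row_o)
    )
    physics_term = mean_square(heat_residual(u_pred, dx, dt, alpha))
    return data_term + weight * physics_term, data_term, physics_term


# Toy setup (assumed values): exact solution u = exp(-alpha*pi^2*t) * sin(pi*x).
alpha, dx, dt = 0.1, 0.05, 1e-4
xs = [i * dx for i in range(21)]
ts = [n * dt for n in range(6)]
u_exact = [
    [math.exp(-alpha * math.pi**2 * t) * math.sin(math.pi * x) for x in xs]
    for t in ts
]

# A prediction that stays close to the observations but oscillates in space.
u_wiggly = [[v + 0.01 * (-1) ** i for i, v in enumerate(row)] for row in u_exact]

loss_exact = physics_informed_loss(u_exact, u_exact, dx, dt, alpha)
loss_wiggly = physics_informed_loss(u_wiggly, u_exact, dx, dt, alpha)
```

In this toy setup the oscillating prediction has a small data term but a large physics term, which is the kind of physically implausible fit that an added residual term is meant to penalize. The `weight` parameter trades off the two terms; how to choose it is problem dependent.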

Paper Structure (31 sections, 10 figures, 3 tables)

Figures (10)

  • Figure 1: (a) Timeline of PICV papers published over the last eight years; the histogram shows an exponentially increasing trend. (b) Application domains of recent PICV papers. The most common domain is computational imaging and photonics, closely followed by medical imaging.
  • Figure 2: A simplified illustrative example of physics incorporation in a computer vision task, adapted from [behera2021pidlnet]. Physics information, in the form of flow data, is extracted from video sequences and incorporated into an aggregating network (PIDLNet).
  • Figure 3: Examples of different physics priors. Governing equations and constraints as priors: (a) a PDE as physics prior [gao2021super], where a PDE loss complements conventional network training, and (c) a physics model as physics prior [yuan2022physdiff], where a physics simulator performs motion projection to generate physically plausible human motions. (b) Physics via historical data [yao2023physics]: a deep network uses historical trajectory data to derive physics insights and data-driven features. (d) Physics information as a visual representation [lutjens2020physics]: a GAN pipeline ingests flood maps as the physics prior, along with pre-flood satellite images, to generate photorealistic post-flood images. (e) Physics information as a statistical property [guo2023dynamic]: using speckle redundancy, the speckles from different configurations are described by different sub-regions of the speckles from a single configuration, and the pre-processed speckle pattern is fed to a NN post-processing module for object reconstruction. (f) Physics information as a physical variable [monakhova2022dancing]: a generative noise model (UNet) is built on physical noise parameters, which are based on prior knowledge of random-variable distributions that approximately model these noise types.
  • Figure 4: (a) The stacked histogram shows how often each type of physics prior is used in each CV task. (b) The pie chart shows the research share of PI approaches across CV tasks. The "Image analysis" task comprises classification and segmentation.
  • Figure 5: Examples of physics incorporation with regard to the CV pipeline. (a) Physics incorporation after data acquisition [monakhova2019learned]: in this imaging task, the physics prior, in the form of a physics system model, is introduced to the custom NN after data acquisition. (b) Physics incorporation during image pre-processing [chen2021physics]: in this temperature field generation task, the physical process module directly generates a motion field from the input images, and the function (F) learns the dynamic characteristics of the motion field. (c) Physics incorporation at the model design (feature extraction) stage [isogawa2020optical]: in this human analysis task, a custom network (P2PSF net) is designed to extract transient features from images in order to model physically consistent 3D human pose. (d) Physics incorporation at the model design (architecture selection/customization) stage [wu2018seeing]: in this PI extension of a regular CNN, physical parameters are included during training for faster permeability prediction. (e) Physics incorporation at the model training stage [kissas2020machine], in this prediction task. (f) End-to-end pipeline of a robot motion planning task, which is also a CV prediction task, with the path solution as the inference or end product. The approach uses a physics-driven objective function and reflects it through the architecture to parameterize the PDE (Eikonal equation) and generate time fields for different scenarios.
  • ...and 5 more figures