Turbulence Strength $C_n^2$ Estimation from Video using Physics-based Deep Learning

Ripon Kumar Saha, Esen Salcin, Jihoo Kim, Joseph Smith, Suren Jayasuriya

TL;DR

The paper tackles estimating path-averaged turbulence strength $C_n^2$ from passive RGB video by comparing classical gradient-based methods, a baseline CNN, and a physics-informed CNN. It introduces a differentiable gradient-based formulation and explicitly incorporates camera parameters to enhance generalization, validated on two open datasets with co-located scintillometer ground truth. Results show deep learning delivers strong interpolation accuracy but struggles with extrapolation and transfer, while the physics-based CNN achieves superior generalization and robustness across datasets. The work provides open data and code, demonstrating a practical path toward reliable turbulence sensing in long-range imaging and atmospheric scenarios.

Abstract

Images captured over long distances suffer from dynamic distortion caused by the turbulent flow of air cells with random temperatures, and thus random refractive indices. This phenomenon, known as image dancing, is commonly characterized by the refractive-index structure constant $C_n^2$, a measure of turbulence strength. For many applications, such as atmospheric forecast models, long-range and astronomical imaging, aviation safety, and optical communication technology, $C_n^2$ estimation is critical for accurately sensing the turbulent environment. Previous methods for $C_n^2$ estimation include estimation from meteorological data (temperature, relative humidity, wind shear, etc.) for single-point measurements, two-ended path-length measurements from an optical scintillometer for path-averaged $C_n^2$, and, more recently, estimation from passive video cameras, which offers low cost and low hardware complexity. In this paper, we present a comparative analysis of classical image gradient methods for $C_n^2$ estimation and modern deep learning-based methods leveraging convolutional neural networks. To enable this, we collect a dataset of video captures along with reference scintillometer measurements for ground truth, and we release this unique dataset to the scientific community. We observe that deep learning methods can achieve higher accuracy when trained on similar data, but suffer from generalization errors on other, unseen imagery compared to classical methods. To overcome this trade-off, we present a novel physics-based network architecture that combines learned convolutional layers with a differentiable image gradient method and maintains high accuracy while generalizing across image datasets.
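To make the idea of a classical image gradient method concrete, the following is a minimal Python sketch, not the paper's own formulation (its equations are not reproduced here). It assumes the first-order model in which the temporal intensity variance at a pixel is roughly the pixel-shift variance times the squared image gradient, and it converts the shift variance to $C_n^2$ with the textbook angle-of-arrival relation $\sigma_\alpha^2 = K \, C_n^2 \, L \, D^{-1/3}$. The function names, the helper `synthetic_frames`, and the default `k_aoa=2.914` (the commonly quoted plane-wave coefficient) are assumptions made for illustration.

```python
import numpy as np


def pixel_shift_variance(frames):
    """Estimate the per-axis variance (in pixels^2) of random image shifts.

    frames: array of shape (T, H, W), grayscale and already stabilized.
    Assumed model: var_t[I] ~= sigma^2 * (Gx^2 + Gy^2), solved for sigma^2
    by summing both sides over all pixels.
    """
    frames = np.asarray(frames, dtype=np.float64)
    mean_img = frames.mean(axis=0)
    temporal_var = frames.var(axis=0, ddof=1)
    gy, gx = np.gradient(mean_img)  # derivatives along rows, then columns
    grad_sq = gx**2 + gy**2
    total = grad_sq.sum()
    if total == 0.0:
        raise ValueError("image has no gradient content; estimate undefined")
    return temporal_var.sum() / total


def estimate_cn2(frames, pfov_rad, aperture_m, path_m, k_aoa=2.914):
    """Convert the pixel-shift variance to a path-averaged C_n^2 estimate.

    pfov_rad:   per-pixel field of view in radians.
    aperture_m: aperture diameter D in metres.
    path_m:     path length L in metres.
    k_aoa:      ASSUMED coefficient in sigma_alpha^2 = k * Cn2 * L * D^(-1/3).
    """
    sigma_alpha2 = pixel_shift_variance(frames) * pfov_rad**2
    return sigma_alpha2 * aperture_m ** (1.0 / 3.0) / (k_aoa * path_m)


def synthetic_frames(sigma_px, n_frames=400, size=64, period=32.0, seed=0):
    """Hypothetical test data: a smooth pattern jittered by Gaussian shifts."""
    rng = np.random.default_rng(seed)
    y, x = np.mgrid[0:size, 0:size].astype(np.float64)
    k = 2.0 * np.pi / period
    shifts = rng.normal(0.0, sigma_px, size=(n_frames, 2))
    return np.stack(
        [np.sin(k * (x + dx)) + np.sin(k * (y + dy)) for dx, dy in shifts]
    )


if __name__ == "__main__":
    frames = synthetic_frames(sigma_px=0.3)
    print("shift variance (px^2):", pixel_shift_variance(frames))
    # Camera values below are placeholders, not taken from the paper.
    print("Cn2 estimate:", estimate_cn2(frames, 1e-5, 0.1, 1000.0))
```

Every operation in this sketch (mean, variance, finite-difference gradient, division) is differentiable, which is what makes it possible in principle to place such an estimator inside a network. The camera parameters enter only through the closed-form scaling, so changing the aperture or path length does not require relearning anything.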

Paper Structure (29 sections, 3 equations, 11 figures, 1 table)

Figures (11)

  • Figure 1: (a) The scintillometer receiver measures $C_n^2$ values, and images are captured with a Nikon zoom camera and a Celestron telescope. (b) The image captured by the camera shows two target boards and the scintillometer. The distortion and bending of lines in the captured images are due to atmospheric turbulence.
  • Figure 2: Overall deep learning architecture for $C_n^2$ estimation. First, all images are aligned to remove high-frequency motion, and the region of interest is cropped. Using an EfficientNetV2-inspired architecture, we estimate the $C_n^2$ value with supervised learning. The gray blocks represent a Conv 3x3 layer with batch normalization and Swish activation, the yellow block represents a Fused-MBConv1 block (Tan and Le, 2021) with a 3x3 kernel, the blue block represents a Fused-MBConv6 block with a 3x3 kernel, and the green block represents a Fused-MBConv6 block with a 5x5 kernel. The Fused-MBConv block itself consists of a Conv 3x3 layer with a Squeeze-and-Excitation (SE) layer, followed by a Conv 1x1 layer, with skip connections.
  • Figure 3: Physics-based deep learning to estimate $C_n^2$ from image sequences with different camera parameters. The green block is a single convolutional layer with a 5x5 kernel and depth 3, and the blue blocks are convolutional layers with a 5x5 kernel and depth 1. The output of the convolutional layers is then multiplied by the camera parameters and the image variance as specified in Equation (2).
  • Figure 4: Comparison of the three methods on simulated data with respect to camera motion. We report both $R^2$ and MAE for these methods.
  • Figure 5: The aperture diameter was scaled by factors from 1x to 51.6x in simulation to observe how the error of each approach changes. The gradient method shows the best accuracy and the physics-based CNN follows closely, while the deep learning approach shows large errors, reflecting its difficulty generalizing to different apertures. The physics-based deep learning method also shows better correlation, whereas the deep learning method generalizes worst, with negative correlation values.
  • ...and 6 more figures
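
The captions above report $R^2$ and MAE and mention negative scores for the baseline network. As a brief illustration, assuming the score is the usual coefficient of determination $R^2 = 1 - SS_{res}/SS_{tot}$ (the paper's exact definition is not shown here), the sketch below shows how a systematically biased predictor can yield a negative value. The function names and the numbers are hypothetical.

```python
import numpy as np


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between reference and predicted values."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    return np.mean(np.abs(y_true - y_pred))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=np.float64)
    y_pred = np.asarray(y_pred, dtype=np.float64)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Hypothetical reference values in arbitrary units, not data from the paper.
    reference = np.array([1.0, 2.0, 3.0, 4.0])
    biased = reference + 3.0  # follows the trend but with a constant offset
    print("MAE:", mean_absolute_error(reference, biased))
    print("R^2:", r_squared(reference, biased))
```

A negative $R^2$ means the predictions fit the reference worse than a constant prediction at the reference mean would, which is one way an estimator that tracks the trend but is miscalibrated for a new aperture can score below zero.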