SurgeMOD: Translating image-space tissue motions into vision-based surgical forces

Mikel De Iturrate Reyzabal, Dionysios Malas, Shuai Wang, Sebastien Ourselin, Hongbin Liu

TL;DR

This work tackles vision-based force estimation in minimally invasive robotic surgery by deriving an image-space motion basis from natural organ motions. It introduces a frequency-domain modal framework where motion textures are transformed into mode shapes via FFT and restricted to $K=20$ low-frequency components, enabling a dynamic-constraint formulation to infer forces directly from video. The method is validated on silicone phantom and ex-vivo porcine tissue, achieving strong alignment with force sensor readings and producing interpretable force textures in the image space. It provides a principled vision-based baseline for surgical force estimation and points to practical extensions, including depth cues and automated segmentation, to enhance haptic feedback in real-time procedures.

Abstract

We present a new approach for vision-based force estimation in Minimally Invasive Robotic Surgery based on a frequency-domain basis of organ motion derived directly from video. Using internal movements generated by natural processes such as breathing or the cardiac cycle, we infer the image-space basis of the motion in the frequency domain. Working in this representation, we discretize the problem to a limited number of low frequencies to build an image-space mechanical model of the environment. We use this pre-built model to pose force estimation as a dynamic-constraint problem. We demonstrate that the method reliably estimates point contact forces in silicone phantom and ex-vivo experiments, matching readings from a force sensor. In addition, we perform qualitative experiments in which we synthesize coherent force textures from surgical videos over a region of interest selected by the user. Our method performs well in both quantitative and qualitative analyses, providing a solid starting point for a purely vision-based method of surgical force estimation.
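To make the frequency-domain basis concrete, here is a minimal sketch of the kind of modal decomposition the abstract describes. It assumes the tissue motion is already available as a dense per-pixel displacement field (e.g. from optical flow); the function name `modal_basis_from_flow`, the array shapes, and the synthetic example are our own illustrative choices, not the paper's implementation.

```python
import numpy as np

def modal_basis_from_flow(flow, K=20):
    """Extract an image-space motion basis from a video of displacements.

    flow : (T, H, W, 2) array of per-pixel displacements over T frames.
    Returns the K lowest positive temporal frequencies and the complex
    mode shapes (K, H, W, 2) at those frequencies.
    """
    T = flow.shape[0]
    spectrum = np.fft.rfft(flow, axis=0)   # positive-frequency half of the FFT
    freqs = np.fft.rfftfreq(T)             # normalized frequencies in [0, 0.5]
    idx = np.arange(1, K + 1)              # skip DC, keep the K lowest frequencies
    return freqs[idx], spectrum[idx]

# Synthetic example: motion oscillating at a single temporal frequency
T, H, W = 64, 8, 8
t = np.arange(T)
flow = np.zeros((T, H, W, 2))
flow[..., 0] = np.sin(2 * np.pi * 4 * t / T)[:, None, None]  # 4 cycles over T frames
freqs, modes = modal_basis_from_flow(flow, K=20)
print(modes.shape)  # (20, 8, 8, 2)
```

Restricting the basis to the first K frequency bins mirrors the paper's observation that natural cyclic motions (breathing, heartbeat) concentrate their energy in the low-frequency part of the spectrum.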
Paper Structure

This paper contains 16 sections, 8 equations, 6 figures, 3 tables.

Figures (6)

  • Figure 1: The graph presents the mean amplitude of the frequency analysis of tissue motion on camera for every frequency over the time period $T$. We analyse all $T/2$ positive frequencies of the spectrum. The selected $K$ frequencies are marked with pink triangular markers. From the graph, we can observe that the majority of the information from the motion generated by the natural cycles is concentrated in the first $15$-$20$ frequencies.
  • Figure 2: Graphical diagram of our experimental setup. Our testing platform is either: silicone phantom or an ex-vivo porcine cardiac tissue. This is connected to a pressure regulator that controls the cyclic cardiac motion of the heart. We place the camera close enough to obtain a clear recording of the motion and a light source to create consistent lighting to reduce its effect on the optical flow prediction.
  • Figure 3: Diagram representation of our proposed force estimation algorithm. The force estimator runs in two steps: calculation of the mode shapes (left) and force estimation (right). The optical flow postprocessing helps produce a more uniform distribution of the power in the frequency domain, as can be observed in the power spectrum (Figure 1).
  • Figure 4: Representation of the different spaces in which forces are represented. On the left we have the 3D Cartesian space representation of the force. This is the force recorded by the sensor for our testing data. On the right, we have the projection of the 3D space into the camera coordinates u-v. The force is also projected into this image space and it is related to the Cartesian space through the intrinsic parameters of the camera.
  • Figure 5: Results of the force comparison between the predicted and measured contact force. This example presents a classic poking scenario that we can divide into four steps based on the force response: 1) rest, there is no contact; 2) push, the contact starts and increases in intensity; 3) lock, the moment of maximum force, at which we hold the tool for a short period of time; and 4) release, the contact between the tool and the tissue is relaxed.
  • ...and 1 more figure
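The relation between Cartesian and image-space forces described in Figure 4 can be sketched with the Jacobian of the standard pinhole projection $u = f_x X/Z$, $v = f_y Y/Z$ evaluated at the contact point. This is a generic illustration of that projection, not the paper's exact formulation; `project_force_to_image` and the focal-length values are assumptions for the example.

```python
import numpy as np

def project_force_to_image(f_xyz, p_xyz, fx, fy):
    """Map a 3D Cartesian force into image (u, v) coordinates.

    f_xyz : 3D force vector at the contact point.
    p_xyz : contact point (X, Y, Z) in camera coordinates, Z > 0.
    fx, fy : camera focal lengths in pixels (from the intrinsics).
    """
    X, Y, Z = p_xyz
    # Jacobian of the pinhole projection at the contact point
    J = np.array([
        [fx / Z, 0.0,    -fx * X / Z**2],
        [0.0,    fy / Z, -fy * Y / Z**2],
    ])
    return J @ np.asarray(f_xyz)

# A force along the optical axis at the image centre has no image-space component
f_uv = project_force_to_image([0.0, 0.0, 1.0], [0.0, 0.0, 0.1], fx=800, fy=800)
print(f_uv)  # [0. 0.]
```

This also illustrates the limitation the figure hints at: the out-of-plane force component is only weakly observable in the image, which is why depth cues are listed as a practical extension.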