Optimal navigation of magnetic artificial microswimmers in blood capillaries with deep reinforcement learning
Lucas Amoudruz, Sergey Litvinov, Petros Koumoutsakos
TL;DR
The paper addresses targeted navigation of magnetic artificial microswimmers, here artificial bacterial flagella (ABFs), in realistic capillary networks. It combines a detailed blood model that resolves individual red blood cells (RBCs) with a reduced-order model (ROM) that serves as the training environment for a reinforcement learning policy. An actor-critic, off-policy algorithm (V-RACER) learns the ABF steering policy in the ROM, $\dot{\mathbf{x}} = \mathbf{u}(\mathbf{x}) + U \mathbf{p} + \sqrt{D}\,\boldsymbol{\xi}$, with a navigation reward that favors progress toward the target. The learned policy transfers robustly to fine-grained dissipative particle dynamics (DPD) blood simulations, in which the magnetic field is adjusted so that the ABF aligns with the policy-predicted direction; the resulting trajectories and travel times are comparable to those obtained in the ROM. The approach offers a computationally efficient route to robust, personalized guidance of magnetic microswimmers in complex vasculature, with potential applications in targeted drug delivery and microsurgery.
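To make the reduced-order model quoted above concrete, the following is a minimal sketch of how such a stochastic equation could be integrated with an Euler-Maruyama scheme. It is not the authors' implementation. The symbols are read here as position $\mathbf{x}$, background flow $\mathbf{u}(\mathbf{x})$, swimming speed $U$, swimming direction $\mathbf{p}$, and noise strength $D$ with unit white noise $\boldsymbol{\xi}$, which is an assumed interpretation. The functions `background_flow` and `greedy_direction`, the uniform flow field, and all parameter values are hypothetical placeholders; in particular, `greedy_direction` simply points at the target and stands in for the learned V-RACER policy.

```python
import numpy as np


def background_flow(x):
    """Hypothetical stand-in for u(x): uniform flow along the first axis."""
    u = np.zeros_like(x)
    u[0] = 0.5
    return u


def greedy_direction(x, target):
    """Placeholder policy: unit vector toward the target (not the learned policy)."""
    d = target - x
    norm = np.linalg.norm(d)
    return d / norm if norm > 0 else np.zeros_like(d)


def simulate(x0, target, U=1.0, D=0.01, dt=0.01, n_steps=2000, tol=0.05, seed=0):
    """Euler-Maruyama integration of dx/dt = u(x) + U p + sqrt(D) xi.

    Assumes xi is unit white noise, so the noise increment over one
    step of size dt is sqrt(D * dt) times a standard normal sample.
    Stops early once the swimmer is within `tol` of the target.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    target = np.asarray(target, dtype=float)
    trajectory = [x.copy()]
    for _ in range(n_steps):
        if np.linalg.norm(target - x) < tol:
            break
        p = greedy_direction(x, target)       # steering direction from the policy
        xi = rng.standard_normal(x.shape)     # Gaussian noise sample
        x = x + (background_flow(x) + U * p) * dt + np.sqrt(D * dt) * xi
        trajectory.append(x.copy())
    return np.array(trajectory)


if __name__ == "__main__":
    traj = simulate(x0=[0.0, 0.0, 0.0], target=[1.0, 1.0, 0.0])
    print("steps taken:", len(traj) - 1)
    print("final distance to target:", np.linalg.norm(traj[-1] - [1.0, 1.0, 0.0]))
```

A model of this kind is cheap to step, which is what makes it practical as a training environment: a reinforcement learning algorithm can run many episodes in it, with the policy replacing the placeholder steering rule used here.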
Abstract
Biomedical applications such as targeted drug delivery, microsurgery, and sensing rely on reaching precise areas within the body in a minimally invasive way. Artificial bacterial flagella (ABFs) have emerged as potential tools for this task by navigating through the circulatory system with the help of external magnetic fields. While their swimming characteristics are well understood in simple settings, their controlled navigation through realistic capillary networks remains a significant challenge due to the complexity of blood flow and the high computational cost of detailed simulations. We address this challenge by conducting numerical simulations of ABFs in retinal capillaries, propelled by an external magnetic field. The simulations are based on a validated blood model that predicts the dynamics of individual red blood cells and their hydrodynamic interactions with ABFs. The magnetic field follows a control policy that brings the ABF to a prescribed target. The control policy is learned with an actor-critic, off-policy reinforcement learning algorithm coupled with a reduced-order model of the system. We show that the same policy robustly guides the ABF to a prescribed target in both the reduced-order model and the fine-grained blood simulations. This approach is suitable for designing robust control policies for personalized medicine at moderate computational cost.
