Theoretical Data-Driven MobilePosenet: Lightweight Neural Network for Accurate Calibration-Free 5-DOF Magnet Localization

Wenxuan Xie, Yuelin Zhang, Jiwei Shan, Hongzhe Sun, Jiewen Tan, Shing Shin Cheng

TL;DR

This work tackles real-time, calibration-free 5-DOF magnet localization for wireless capsule endoscopy by replacing data-intensive real-world training with learning from theoretical data. It introduces MobilePosenet, a lightweight end-to-end network built on inverted residual attention blocks and depthwise separable convolutions, whose inputs are augmented with sensor coordinates and random noise to bridge the gap between the theoretical model and the measured magnetic field. Trained solely on data from the magnetic dipole model, it achieves a localization error of $1.54\pm1.03$ mm and $2.24\pm1.84^{\circ}$ with low latency ($0.91$ ms on GPU). It avoids both the computational delay and initial-estimate dependence of LM-based methods and the hardware calibration and real-world data collection required by other data-driven approaches. Removing those two steps should allow faster deployment across clinical settings while keeping performance competitive with or better than state-of-the-art methods.

Abstract

Permanent magnet tracking using an external sensor array is crucial for the accurate localization of wireless capsule endoscope robots. Traditional tracking algorithms, based on the magnetic dipole model and the Levenberg-Marquardt (LM) algorithm, face challenges related to computational delays and the need for an initial position estimate. More recently proposed neural network-based approaches often require extensive hardware calibration and real-world data collection, which are time-consuming and labor-intensive. To address these challenges, we propose MobilePosenet, a lightweight neural network architecture that leverages depthwise separable convolutions to minimize computational cost and a channel attention mechanism to enhance localization accuracy. In addition, the network inputs integrate the sensors' coordinate information and random noise, compensating for the discrepancies between the theoretical model and the actual magnetic fields and thus allowing MobilePosenet to be trained entirely on theoretical data. Experimental evaluations conducted in a $90 \times 90 \times 80$ mm workspace demonstrate that MobilePosenet achieves excellent 5-DOF localization accuracy ($1.54 \pm 1.03$ mm and $2.24 \pm 1.84^{\circ}$) and inference speed (0.9 ms) compared with state-of-the-art methods trained on real-world data. Since network training relies solely on theoretical data, MobilePosenet eliminates the hardware calibration and real-world data collection process, improving the generalizability of this permanent magnet localization method and its potential for rapid adoption in different clinical settings.
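
As a rough illustration of what "training entirely on theoretical data" can look like, the sketch below generates one synthetic sample from the point-dipole model, appends each sensor's coordinates to its reading, and adds Gaussian noise. It is not the authors' pipeline: the sensor pitch (`SPACING`), the noise level (`noise_std`), and all function names are assumptions made for this example, and the units of $B_T$ are not stated in the source, so the output is only meaningful in whatever consistent unit system $B_T$ belongs to.

```python
import math
import random

B_T = 8.18e-2   # magnet constant quoted in the Figure 3 caption (units not stated there)
SPACING = 30.0  # hypothetical sensor pitch; the source does not give this value


def sensor_grid(n=4, spacing=SPACING):
    """Coordinates of an n x n planar sensor array centred on the origin (z = 0)."""
    offset = (n - 1) * spacing / 2.0
    return [(i * spacing - offset, j * spacing - offset, 0.0)
            for i in range(n) for j in range(n)]


def dipole_field(sensor, position, direction, b_t=B_T):
    """Point-dipole flux density at one sensor.

    B = b_t * (3 (H0 . R) R / |R|^5 - H0 / |R|^3), with R = sensor - position
    and H0 the unit vector along the magnet axis.
    """
    norm = math.sqrt(sum(c * c for c in direction))
    h0 = [c / norm for c in direction]
    r = [s - p for s, p in zip(sensor, position)]
    dist = math.sqrt(sum(c * c for c in r))
    dot = sum(h * c for h, c in zip(h0, r))
    return [b_t * (3.0 * dot * rc / dist**5 - hc / dist**3)
            for rc, hc in zip(r, h0)]


def make_sample(position, direction, noise_std=0.0, rng=random):
    """One synthetic sample: per-sensor [Bx, By, Bz, x, y, z] rows plus the pose label."""
    norm = math.sqrt(sum(c * c for c in direction))
    unit_dir = tuple(c / norm for c in direction)
    features = []
    for s in sensor_grid():
        clean = dipole_field(s, position, direction)
        noisy = [v + rng.gauss(0.0, noise_std) for v in clean]  # assumed noise model
        features.append(noisy + list(s))
    return features, (tuple(position), unit_dir)
```

Calling `make_sample((0.0, 0.0, 50.0), (0.0, 0.0, 1.0), noise_std=1e-9)` yields 16 rows of six values each, matching the $4 \times 4$ array described in the Figure 3 caption. The label carries a position and a unit orientation vector, which together have five degrees of freedom because rotation about the magnet axis is not observable in the dipole model.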

Paper Structure (15 sections, 10 equations, 5 figures, 5 tables)

This paper contains 15 sections, 10 equations, 5 figures, 5 tables.

Figures (5)

  • Figure 1: Coordinate system for magnet localization: $O_1$ is the center of the magnet, $H_0$ is the direction of the magnet, $R_i$ is the distance between the $i$-th sensor and $O_1$, $l$ is the magnet length, and $d$ is the magnet diameter.
  • Figure 2: Structure and Workflow of MobilePosenet: The network performs end-to-end predictions of the magnet's pose. The input consists of sensor readings and coordinates, while the output consists of the magnet's position and orientation. In the network architecture, ConvBnRelu represents a composite operation that includes a Conv2d layer, Batch Normalization, and the ReLU activation function. IRAB denotes an Inverted Residual Attention Block. AdaptiveAvgPool refers to an adaptive average pooling layer. Dwise indicates depthwise separable convolution. SEBlock (Squeeze-and-Excitation Block) is a channel attention mechanism that enhances feature extraction by reweighting the input channels.
  • Figure 3: The experimental platform comprises a triaxial magnetometer array, a calibration board, a plastic magnet shell with a height of 10 mm and an inner diameter of 10 mm, and a cylindrical N35 permanent magnet with a diameter and height of 10 mm ($B_T = 8.18 \times 10^{-2}$). The calibration board measures 100 $\times$ 100 mm, with 15 mm between adjacent holes. The magnetometer array consists of 16 triaxial magnetometers arranged in a $4 \times 4$ configuration.
  • Figure 4: The six magnet orientations corresponding to the boundary values $-1$ and $1$ of the components of the orientation vector $(m, n, p)$.
  • Figure 5: Positioning errors at various heights. The height refers to the distance between the permanent magnet center and the sensor plane.
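
Two of the building blocks named in the Figure 2 caption can be illustrated without a deep-learning framework. The sketch below is not MobilePosenet itself: it compares the weight count of a standard convolution with that of a depthwise separable one, and implements the squeeze, excite, and scale steps of a Squeeze-and-Excitation block on plain Python lists. The function names, the layer sizes in the usage note, and the weight matrices `w_reduce` and `w_expand` are assumptions made for this example.

```python
import math


def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """One k x k depthwise filter per input channel, then a 1 x 1 pointwise mix."""
    return c_in * k * k + c_in * c_out


def se_reweight(feature_maps, w_reduce, w_expand):
    """Squeeze-and-Excitation over a list of C feature maps (each a 2-D list).

    squeeze: global average of every channel
    excite : FC -> ReLU -> FC -> sigmoid, giving one gate in (0, 1) per channel
    scale  : multiply every value in a channel by that channel's gate
    """
    squeezed = [sum(map(sum, fm)) / (len(fm) * len(fm[0])) for fm in feature_maps]
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed)))
              for row in w_reduce]
    gates = [1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
             for row in w_expand]
    scaled = [[[v * g for v in line] for line in fm]
              for fm, g in zip(feature_maps, gates)]
    return scaled, gates
```

For a hypothetical layer mapping 32 channels to 64 with a $3 \times 3$ kernel, `conv_params(32, 64, 3)` gives 18432 weights while `depthwise_separable_params(32, 64, 3)` gives 2336, roughly eight times fewer. In `se_reweight`, `w_reduce` has one row per hidden unit and `w_expand` one row per output channel; in a trained network these would be learned, whereas here they are supplied by the caller.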