Signal Intensity-weighted coordinate channels improve learning stability and generalisation in 1D and 2D CNNs in localisation tasks on biomedical signals

Vittal L. Rao

TL;DR

This work addresses the difficulty of coordinate regression in CNNs for biomedical localisation by introducing intensity-weighted coordinate channels, which explicitly couple spatial/temporal position with local signal intensity. The authors demonstrate that this simple input-level inductive bias leads to faster convergence and better generalisation than standard CoordConv on both 2D Pap-smear nuclear localisation and 1D ECG changepoint tasks, using LakshyaNet and NimeshaNet architectures. They validate the approach with extensive cross-validation, bootstrapped significance tests, and ablations including reduced-parameter variants, showing robust improvements across modalities and network sizes. The approach is lightweight, modality-agnostic, and publicly available, with potential broad impact on landmark localisation in biomedical imaging and signal analysis.

Abstract

Localisation tasks in biomedical data often require models to learn meaningful spatial or temporal relationships from signals with complex intensity distributions. A common strategy, exemplified by CoordConv layers, is to append coordinate channels to convolutional inputs, enabling networks to learn absolute positions. In this work, we propose a signal intensity-weighted coordinate representation that replaces the pure coordinate channels with channels scaled by local signal intensity. This modification embeds an intensity-position coupling directly in the input representation, introducing a simple and modality-agnostic inductive bias. We evaluate the approach on two distinct localisation problems: (i) predicting the time of morphological transition in 20-second, two-lead ECG signals, and (ii) regressing the coordinates of nuclear centres in cytological images from the SiPaKMeD dataset. In both cases, the proposed representation yields faster convergence and higher generalisation performance relative to conventional coordinate-channel approaches, demonstrating its effectiveness across both one-dimensional and two-dimensional biomedical signals.
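
The intensity-position coupling described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the coordinate range of $[-1, 1]$, the use of the channel mean as the 2D intensity, and the use of the mean absolute amplitude across leads as the 1D intensity are all assumptions made for illustration, since the abstract only states that the coordinate channels are scaled by local signal intensity.

```python
import numpy as np


def coord_channels_2d(height, width):
    """Plain CoordConv-style x and y channels, each of shape (H, W).

    Assumption: coordinates are normalised to [-1, 1].
    """
    ys = np.linspace(-1.0, 1.0, height).reshape(height, 1)
    xs = np.linspace(-1.0, 1.0, width).reshape(1, width)
    x_chan = np.broadcast_to(xs, (height, width))
    y_chan = np.broadcast_to(ys, (height, width))
    return x_chan, y_chan


def intensity_weighted_input_2d(image):
    """Map an image of shape (C, H, W) to an input of shape (C + 2, H, W).

    Assumption: "local signal intensity" is the per-pixel mean over channels.
    """
    image = np.asarray(image, dtype=np.float64)
    _, height, width = image.shape
    intensity = image.mean(axis=0)
    x_chan, y_chan = coord_channels_2d(height, width)
    # Scale each coordinate channel by the intensity at the same pixel.
    weighted = np.stack([x_chan * intensity, y_chan * intensity])
    return np.concatenate([image, weighted], axis=0)


def intensity_weighted_input_1d(signal):
    """Map a signal of shape (L, T) to an input of shape (L + 1, T).

    Assumption: intensity is the mean absolute amplitude across leads.
    """
    signal = np.asarray(signal, dtype=np.float64)
    _, length = signal.shape
    t = np.linspace(-1.0, 1.0, length)
    intensity = np.abs(signal).mean(axis=0)
    return np.concatenate([signal, (t * intensity)[None, :]], axis=0)


# Hypothetical inputs: a 256 x 256 RGB image and a two-lead ECG trace.
rng = np.random.default_rng(0)
image = rng.random((3, 256, 256))
ecg = rng.standard_normal((2, 5000))  # 5000 samples is an assumed length
print(intensity_weighted_input_2d(image).shape)  # (5, 256, 256)
print(intensity_weighted_input_1d(ecg).shape)    # (3, 5000)
```

Under these assumptions, a pixel or time step with zero intensity contributes zero to the added channels, whereas a plain CoordConv channel carries the same coordinate value regardless of the signal at that position.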

Paper Structure

This paper contains 29 sections, 3 equations, 3 figures, 3 tables.

Figures (3)

  • Figure 1: Illustration of the LakshyaNet architecture. The network takes a $5 \times 256 \times 256$ input tensor (RGB channels plus two coordinate channels) and passes it through four convolutional blocks with batch normalization and ReLU activations. Spatial dimensions are reduced using average pooling in the first three blocks. A learnable depthwise convolution layer (kernel size $32 \times 32$) aggregates global spatial information, followed by a fully connected layer that predicts the ellipse center coordinates $(h, k)$ scaled to the image range [0, 255].
  • Figure 2: Average train and test $R^2$ scores across epochs for the reduced-parameter variant of our model (with additional average pooling after Conv Block 4).
  • Figure 3: Sample images from the nuclear localisation task and plots from the ECG changepoint detection task, with predictions and ground truth, demonstrating the improved accuracy of the proposed method.
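
The spatial sizes implied by the Figure 1 caption can be checked with a short calculation. This is a sketch under stated assumptions, not a description of the published architecture: it assumes each average-pooling step halves the spatial size, that the convolutions inside each block preserve spatial size, and that the depthwise convolution uses stride 1 with no padding. The function name and its parameters are hypothetical.

```python
def lakshyanet_spatial_sizes(input_size=256, pooled_blocks=3,
                             pool_stride=2, depthwise_kernel=32):
    """Return the side length after each pooled block and after the
    depthwise convolution, under the assumptions stated above."""
    sizes = [input_size]
    for _ in range(pooled_blocks):
        # Assumption: average pooling with stride 2 halves each side.
        sizes.append(sizes[-1] // pool_stride)
    # Unpadded, stride-1 convolution: output = input - kernel + 1.
    after_depthwise = sizes[-1] - depthwise_kernel + 1
    return sizes, after_depthwise


sizes, after_depthwise = lakshyanet_spatial_sizes()
print(sizes)            # [256, 128, 64, 32]
print(after_depthwise)  # 1
```

Under these assumptions the feature map entering the depthwise layer is $32 \times 32$, so a $32 \times 32$ kernel covers it exactly and yields one value per channel, which is consistent with the caption's statement that this layer aggregates global spatial information before the fully connected layer.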