NAR-*ICP: Neural Execution of Classical ICP-based Pointcloud Registration Algorithms

Efimia Panagiotaki, Daniele De Martini, Lars Kunze, Paul Newman, Petar Veličković

TL;DR

This work presents NAR-*ICP, a Graph Neural Network framework that learns to execute ICP-based point-cloud registration algorithms by mimicking their intermediate steps within the Neural Algorithmic Reasoning paradigm. By encoding the registration process as a sequence of graph-structured intermediate states, the model can learn to converge without requiring perfect initial alignments, aided by a learned termination mechanism and a ground-truth optimisation step. Empirical results on synthetic and real data (SemanticKITTI) show that NAR-*ICP often surpasses the original algorithms and competitive learned baselines in accuracy and runtime, while maintaining interpretability via algorithmic reasoning. The approach promises robust integration into learning pipelines and potential extensions to broader robotics tasks such as navigation and manipulation, enabling more transparent, reliable, and efficient perception systems.

Abstract

This study explores the intersection of neural networks and classical robotics algorithms through the Neural Algorithmic Reasoning (NAR) blueprint, enabling the training of neural networks to reason like classical robotics algorithms by learning to execute them. Algorithms are integral to robotics and safety-critical applications due to their predictable and consistent performance through logical and mathematical principles. In contrast, while neural networks are highly adaptable, handling complex, high-dimensional data and generalising across tasks, they often lack interpretability and transparency in their internal computations. To bridge the two, we propose a novel Graph Neural Network (GNN)-based framework, NAR-*ICP, that learns the intermediate computations of classical ICP-based registration algorithms, extending the CLRS Benchmark. We evaluate our approach across real-world and synthetic datasets, demonstrating its flexibility in handling complex inputs, and its potential to be used within larger learning pipelines. Our method achieves superior performance compared to the baselines, even surpassing the algorithms it was trained on, further demonstrating its ability to generalise beyond the capabilities of traditional algorithms.

Paper Structure

This paper contains 32 sections, 8 equations, 7 figures, 16 tables, 2 algorithms.

Figures (7)

  • Figure 1: Learning process: Each intermediate algorithmic output is converted into a graph, $G^{(t)} = (V^{(t)}, E^{(t)}, x_i^{(t)}, e_{ij}^{(t)}, g_k^{(t)})$, before being passed to an encoder-processor-decoder model. The encoder generates embedding representations from the input features, $Z^{(t)}$, which are then used in a Triplet to generate latent features $H^{(t)}$. These features are passed to a decoder that predicts the features, $\hat{y}^{(t)}$, that effectively correspond to the output of the algorithm at step $t$. The process is repeated for all intermediate steps of the algorithm. The model additionally learns a phase flag, $\hat{p}^{(t)}$, and a termination flag, $\hat{s}^{(t)}$, as independent binary classes. The phase flag predicts the current phase of the algorithm (finding correspondences, or estimating the relative transformation and error), and the termination flag predicts its final step. During inference, when the termination flag is triggered, the neural algorithmic execution terminates at $\hat{y}^{(T)}$. Additionally, we leverage the ground truth from each input dataset as a final training signal to optimise the model's output $\hat{y}^{(T+1)}$ (shown on the right).
  • Figure 2: At each algorithmic step, we encode the input features, $x_i^{(t)}, x_j^{(t)}, e_{ij}^{(t)},$ and $g_k^{(t)}$, and use the latent features from the previous step of the processor to generate the current step's latent representations. In this context, $h_i^{(t)}$ and $h_j^{(t)}$ correspond to the node features, while $h_{ij}^{(t)}$ denotes the edge features, which result from the aggregation of node and graph encodings. The loss is calculated between the decoder's output $\hat{y}^{(t)}$ and the output of the algorithm $y^{(t)}$ at each iteration. Our method uses the ground truth from the input dataset $y_{gt}$ as an additional optimisation step $t=T+1$.
  • Figure 3: Visualisation of the intermediate step outputs for (a) P2P-ICP and (b) ours, NAR-P2Pv2. Despite being trained on the former, NAR-P2Pv2 demonstrates superior registration performance, achieving better point cloud alignment and finding more accurate correspondences. To simplify the visualisation, we use the predictions for the $\mathtt{phase}$ hint to identify the intermediate algorithmic components and display those from the second phase of the algorithm.
  • Figure 4: Error distributions of the predicted transformed point clouds at each intermediate step of the neural execution, across all benchmarks, reported as MSE$^{\textbf{t}}(\downarrow)$ (outliers capped at $0.25$ for clarity) together with the median error and IQR.
  • Figure 5: Ablation study of different model architectures, comparing the MSE$^{\textbf{t}}(\downarrow)$ error distributions of the intermediate-step predictions.
  • ...and 2 more figures
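
The captions for Figures 1 and 3 refer to two phases of the classical algorithm (finding correspondences, then estimating the relative transformation and error) and to a termination flag. As a rough illustration of the kind of step-by-step trajectory such a model could be supervised on, the following is a minimal NumPy sketch of classical point-to-point ICP that records every intermediate state. It is not the authors' code and not the NAR-*ICP model itself. The function names, the dictionary keys (`phase`, `corr`, `points`, `error`, `stop`), the 0/1 phase encoding, and the convergence tolerance are all assumptions made for this example.

```python
import numpy as np


def nearest_neighbours(src, dst):
    """Phase 1: index of the closest target point for every source point."""
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)


def best_rigid_transform(src, dst):
    """Phase 2: least-squares rotation R and translation t (Kabsch / SVD),
    such that src @ R.T + t approximates dst for paired rows."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)          # cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    D = np.eye(src.shape[1])
    # Flip the last axis if needed so that R is a proper rotation (det = +1).
    D[-1, -1] = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ D @ U.T
    t = mu_d - R @ mu_s
    return R, t


def p2p_icp_trajectory(src, dst, max_iters=20, tol=1e-8):
    """Run point-to-point ICP and log one record per phase per iteration.

    The record layout is hypothetical; it only mirrors the idea of exposing
    intermediate states, a phase flag and a termination flag.
    """
    cur = src.copy()
    steps = []
    prev_err = np.inf
    for _ in range(max_iters):
        # Phase 0 (assumed encoding): correspondence search.
        idx = nearest_neighbours(cur, dst)
        steps.append({"phase": 0, "corr": idx, "points": cur.copy(),
                      "stop": False})

        # Phase 1 (assumed encoding): transform estimation and error.
        R, t = best_rigid_transform(cur, dst[idx])
        cur = cur @ R.T + t
        err = float(((cur - dst[idx]) ** 2).sum(axis=1).mean())
        stop = abs(prev_err - err) < tol       # simple convergence test
        steps.append({"phase": 1, "corr": idx, "points": cur.copy(),
                      "error": err, "stop": stop})
        if stop:
            break
        prev_err = err
    return cur, steps


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = rng.uniform(-1.0, 1.0, size=(50, 3))
    angle = 0.05                               # small misalignment in radians
    R0 = np.array([[np.cos(angle), -np.sin(angle), 0.0],
                   [np.sin(angle),  np.cos(angle), 0.0],
                   [0.0, 0.0, 1.0]])
    source = target @ R0.T + np.array([0.02, -0.01, 0.01])
    aligned, steps = p2p_icp_trajectory(source, target)
    for s in steps:
        if s["phase"] == 1:
            print(f"error={s['error']:.3e} stop={s['stop']}")
```

Each logged record plays the role of one intermediate state $y^{(t)}$ in the captions above. In this sketch the `stop` entry is set by a plain threshold on the change in error, whereas the captions describe the termination flag $\hat{s}^{(t)}$ as something the model learns to predict.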