Tract-RLFormer: A Tract-Specific RL policy based Decoder-only Transformer Network

Ankita Joshi, Ashutosh Sharma, Anoushkrit Goel, Ranjeet Ranjan Jha, Chirag Ahuja, Arnav Bhavsar, Aditya Nigam

TL;DR

This work proposes Tract-RLFormer, a network that combines supervised and reinforcement learning in a two-stage policy refinement process, markedly improving accuracy and generalizability across datasets.

Abstract

Fiber tractography is a cornerstone of neuroimaging, enabling the detailed mapping of the brain's white matter pathways through diffusion MRI. This is crucial for understanding brain connectivity and function, making it a valuable tool in neurological applications. Despite its importance, tractography faces challenges due to its complexity and its susceptibility to false positives, which can misrepresent vital pathways. To address these issues, recent strategies have shifted towards deep learning, using either supervised learning, which depends on precise ground truth, or reinforcement learning, which operates without it. In this work, we propose Tract-RLFormer, a network that combines supervised and reinforcement learning in a two-stage policy refinement process, markedly improving accuracy and generalizability across datasets. By employing a tract-specific approach, our network directly delineates the tracts of interest, bypassing the traditional segmentation process. Through rigorous validation on datasets such as TractoInferno, HCP, and ISMRM-2015, our methodology demonstrates a leap forward in tractography, showcasing its ability to accurately map the brain's white matter tracts.

Paper Structure

This paper contains 13 sections, 2 equations, 4 figures, 6 tables.

Figures (4)

  • Figure 1: Overview of the proposed Iterative Policy Learning for Tract-Specific Generation using DWI data. (a) An RL agent ($\pi_{\theta}$) interacts with the environment ($E$) to learn an optimal level-1 policy ($\pi_{\theta_{opt}}$). (b) This policy is used to generate tract-specific roll-outs, denoted as 'experience replay'. (c) and (d) illustrate the offline, auto-regressive training of the proposed Tract-RLFormer $\phi$, referred to as T-RLF, over these roll-outs. In (c), T-RLF undergoes general pre-training, while in (d) it is fine-tuned to learn an optimal tract-specific policy ($\pi_{\phi_{opt}}$). (e) shows the testing phase, where T-RLF, which has learned the new level-2 policy ($\pi_{\phi_{opt}}$), performs tracking in environment $E$ to produce the desired tract. Training and tracking steps are shown on yellow and orange backgrounds, respectively.
  • Figure 2: Data Representation for T-RLF: Tract-specific policy refinement using a trajectory-based approach in an RL agent's experience space. The figure illustrates a length-$k$ fiber streamline $f$ in human brain voxel space, represented as a trajectory $\tau = (R_0, s_0, a_0, R_1, s_1, a_1, \ldots, R_k, s_k, a_k)$. Each point in the streamline corresponds to a return-to-go, state, and action tuple at a time-step $t$.
  • Figure 3: Data-Driven Policy Learning: Visual representation of training Tract-RLFormer for action prediction at time-step $t$, using context information from a length-$K$ fiber (see the training-details section). The input sequence tuples $\langle R, s, a \rangle$ are causally masked from $a_t$ onwards and processed through embedding layers $emb_R$, $emb_s$, and $emb_a$, with a learnable positional encoding layer ($PE$). Embeddings are processed by $L$ decoder blocks ($L=3$ for pre-training, $L=4$ for fine-tuning), incorporating Multi-Head Attention (MHA) and Multi-Layer Perceptron (MLP), to generate the predicted action $\hat{a}_t$.
  • Figure 4: Visual comparison of reconstructed tracts illustrating (a) intra-dataset and inter-dataset variability, and (b) variability across tracts reconstructed by different algorithms. The depicted tracts include the left PYT, CG, and a part of CC. The algorithms evaluated in the bottom section of the figure are T-RLF (ours), TD3, and PFT.
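
The trajectory layout in Figure 2 and the causal masking in Figure 3 can be illustrated with a small NumPy sketch. It is not the authors' implementation. It assumes the usual Decision Transformer convention that the return-to-go $R_t$ is the undiscounted sum of rewards from step $t$ onwards, and the toy streamline, the reward values, and the helper names (`returns_to_go`, `build_trajectory`, `causal_mask`) are all hypothetical. The embedding layers, positional encoding, and decoder blocks are omitted.

```python
import numpy as np

def returns_to_go(rewards):
    """R_t = r_t + r_{t+1} + ... (undiscounted; an assumed convention)."""
    rewards = np.asarray(rewards, dtype=float)
    # Reverse, take the running sum, and reverse back to get suffix sums.
    return np.cumsum(rewards[::-1])[::-1]

def build_trajectory(rewards, states, actions):
    """Interleave tokens as (R_0, s_0, a_0, R_1, s_1, a_1, ...)."""
    rtg = returns_to_go(rewards)
    tokens = []
    for t in range(len(rtg)):
        tokens.append(("R", t, rtg[t]))
        tokens.append(("s", t, states[t]))
        tokens.append(("a", t, actions[t]))
    return tokens

def causal_mask(n_tokens):
    """Lower-triangular mask: token i may attend to tokens 0..i only."""
    return np.tril(np.ones((n_tokens, n_tokens), dtype=bool))

# Hypothetical toy streamline: four 3-D points and the step taken at each.
states = np.array([[0, 0, 0], [1, 0, 0], [2, 0, 0], [2, 1, 0]], dtype=float)
actions = np.array([[1, 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 0]], dtype=float)
rewards = [1.0, 1.0, 0.5, 0.0]  # made-up per-step rewards

tokens = build_trajectory(rewards, states, actions)
mask = causal_mask(len(tokens))

# To predict a_t, read the output at the position of s_t (index 3t + 1).
# That position sees every token up to and including s_t, but not a_t.
t = 2
query = 3 * t + 1
visible = [f"{kind}_{step}"
           for (kind, step, _), ok in zip(tokens, mask[query]) if ok]
print(visible)
```

With this ordering, a plain lower-triangular mask is enough to hide $a_t$ and everything after it when the prediction is read at the $s_t$ position, which matches the "causally masked from $a_t$ onwards" description in the caption.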