DINeuro: Distilling Knowledge from 2D Natural Images via Deformable Tubular Transferring Strategy for 3D Neuron Reconstruction

Yik San Cheng, Runkai Zhao, Heng Wang, Hanchuan Peng, Yui Lo, Yuqian Chen, Lauren J. O'Donnell, Weidong Cai

TL;DR

The paper addresses accurate reconstruction of 3D neuron morphology from noisy light-microscopy data. It proposes DINeuro, a framework that distills knowledge from a 2D Vision Transformer pre-trained on natural images (DINO) and transfers it to a 3D ViT through a deformable tubular transferring strategy, so that the 3D model can capture tree-like neuronal structures. Key contributions include a 2D-to-3D weight inflation method and a deformable tubular adaptation that yields multi-view tubular kernels along the three axes, improving segmentation and the subsequent reconstruction. On the Janelia dataset of the BigNeuron project, DINeuro improves segmentation by 4.53% in mean Dice and 3.56% in mean 95% Hausdorff distance (Hd95), which suggests that morphological priors transferred from natural images benefit 3D neuron analysis.
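
To make the 2D-to-3D weight inflation idea concrete, here is a minimal sketch of plain kernel inflation: a 2D kernel is repeated along a new depth axis and rescaled. This is an assumption for illustration only. The summary does not specify how DINeuro inflates its weights, and the sketch omits the deformable tubular adaptation entirely. The function name `inflate_2d_kernel` and the weight shapes are hypothetical.

```python
import numpy as np

def inflate_2d_kernel(w2d: np.ndarray, depth: int) -> np.ndarray:
    """Inflate (out_c, in_c, kh, kw) weights to (out_c, in_c, depth, kh, kw).

    Illustrative averaging-style inflation, not the paper's exact procedure.
    """
    if w2d.ndim != 4:
        raise ValueError("expected weights shaped (out_c, in_c, kh, kw)")
    # Repeat the kernel along a new depth axis, then divide by depth so that
    # summing over depth recovers the original 2D kernel.
    w3d = np.repeat(w2d[:, :, np.newaxis, :, :], depth, axis=2)
    return w3d / depth

rng = np.random.default_rng(0)
w2d = rng.standard_normal((8, 3, 16, 16))  # hypothetical 2D patch-embedding weights
w3d = inflate_2d_kernel(w2d, depth=4)
print(w3d.shape)  # (8, 3, 4, 16, 16)
```

With this scaling, a 3D input that is constant along depth produces the same response as the original 2D kernel, which is why the pre-trained features remain a usable starting point.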

Abstract

Reconstructing neuron morphology from 3D light microscope imaging data is critical to aid neuroscientists in analyzing brain networks and neuroanatomy. With the boost from deep learning techniques, a variety of learning-based segmentation models have been developed to enhance the signal-to-noise ratio of raw neuron images as a pre-processing step in the reconstruction workflow. However, most existing models directly encode the latent representative features of volumetric neuron data but neglect their intrinsic morphological knowledge. To address this limitation, we design a novel framework that distills the prior knowledge from a 2D Vision Transformer pre-trained on extensive 2D natural images to facilitate neuronal morphological learning of our 3D Vision Transformer. To bridge the knowledge gap between the 2D natural image and 3D microscopic morphologic domains, we propose a deformable tubular transferring strategy that adapts the pre-trained 2D natural knowledge to the inherent tubular characteristics of neuronal structure in the latent embedding space. The experimental results on the Janelia dataset of the BigNeuron project demonstrate that our method achieves a segmentation performance improvement of 4.53% in mean Dice and 3.56% in mean 95% Hausdorff distance.
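
For readers unfamiliar with the two reported metrics, the following sketch computes a Dice score and a 95% Hausdorff distance on binary masks. It is a brute-force illustration under stated assumptions: distances are measured between all foreground voxels (not only surface voxels), in voxel units, and the symmetric value is the larger of the two directed 95th percentiles. The paper's evaluation code may use a different convention.

```python
import numpy as np

def dice_score(pred, target) -> float:
    """Dice overlap between two binary masks (1.0 when both are empty)."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0
    return float(2.0 * np.logical_and(pred, target).sum() / denom)

def hd95(pred, target) -> float:
    """95th-percentile symmetric Hausdorff distance (brute force, voxel units)."""
    p = np.argwhere(np.asarray(pred, dtype=bool)).astype(float)
    t = np.argwhere(np.asarray(target, dtype=bool)).astype(float)
    if len(p) == 0 or len(t) == 0:
        raise ValueError("both masks need at least one foreground voxel")
    # Pairwise Euclidean distances between every foreground voxel pair.
    d = np.linalg.norm(p[:, None, :] - t[None, :, :], axis=-1)
    pred_to_target = d.min(axis=1)
    target_to_pred = d.min(axis=0)
    return float(max(np.percentile(pred_to_target, 95),
                     np.percentile(target_to_pred, 95)))

# Toy example: a 2x2x2 cube and the same cube shifted by one voxel.
target = np.zeros((4, 4, 4), dtype=bool)
target[1:3, 1:3, 1:3] = True
pred = np.zeros((4, 4, 4), dtype=bool)
pred[1:3, 1:3, 2:4] = True

print(dice_score(pred, target))  # 0.5
print(hd95(pred, target))        # 1.0
```

Higher Dice is better, while lower Hd95 is better, so an "improvement" in Hd95 corresponds to a decrease in distance.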

Paper Structure

This paper contains 12 sections, 3 figures, 2 tables.

Figures (3)

  • Figure 1: Overview of the proposed framework, DINeuro. The 3D neuron image is first partitioned into multiple 3D blocks, which are forwarded to a 3D ViT for block-to-slice segmentation. The 3D ViT is initialized through the deformable tubular transferring strategy, which dynamically aligns the pre-trained 2D kernel into a 3D deformable tubular kernel to better capture the tree-shaped features of a 3D neuron image. The segmented slices are then stacked to produce the final segmentation, and a tracing method is applied to reconstruct the overall neuron structure.
  • Figure 2: Illustration of the fusion of multi-view features extracted along the z, y, and x directions. The 2D convolutional kernel is adapted and shifted into 3D deformable tubular kernels in three directions, which capture the multi-view geometric information of 3D neurons.
  • Figure 3: Segmentation results (top) and the corresponding neuron reconstructions (bottom) obtained with the SmartTracing algorithm for a sample 3D neuron image. The '+' symbol indicates that SmartTracing was applied to the segmented output of the model.
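
The block-to-slice workflow described in the Figure 1 caption can be sketched as follows. Everything beyond "partition into blocks, segment each block to a slice, stack the slices" is an assumption made for illustration: the sliding window along z, the edge padding, the choice of one block per output slice, and the thresholding function `placeholder_block_to_slice`, which merely stands in for the 3D ViT.

```python
import numpy as np

def iter_blocks(volume: np.ndarray, block_depth: int):
    """Yield one (block_depth, H, W) block centred on each z-slice."""
    half = block_depth // 2
    # Edge-pad along z so that every slice, including the first and last,
    # sits inside a full-depth block (assumed strategy).
    padded = np.pad(volume,
                    ((half, block_depth - 1 - half), (0, 0), (0, 0)),
                    mode="edge")
    for z in range(volume.shape[0]):
        yield padded[z:z + block_depth]

def placeholder_block_to_slice(block: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for the 3D ViT: threshold the mean over depth."""
    return block.mean(axis=0) > 0.5

def segment_volume(volume: np.ndarray, block_depth: int = 3) -> np.ndarray:
    """Segment each block to a 2D mask, then stack the masks along z."""
    slices = [placeholder_block_to_slice(b)
              for b in iter_blocks(volume, block_depth)]
    return np.stack(slices, axis=0)

volume = np.zeros((6, 8, 8), dtype=float)
volume[:, 4, 2:6] = 1.0  # toy bright "neurite" running through every slice
mask = segment_volume(volume, block_depth=3)
print(mask.shape)  # (6, 8, 8)
```

In the actual pipeline the stacked mask would then be passed to a tracing method such as SmartTracing, which is outside the scope of this sketch.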