Distilling Knowledge for Designing Computational Imaging Systems

Leon Suarez-Rodriguez, Roman Jacome, Henry Arguello

TL;DR

The paper addresses two weaknesses of end-to-end (E2E) optimization of computational imaging (CI) systems: physical constraints on the encoder limit performance, and learning the encoder only by backpropagating the reconstruction error leads to vanishing gradients. It introduces a knowledge distillation (KD) framework in which a pretrained, less-constrained teacher guides a constrained student in three stages: teacher creation via constraint relaxation, teacher optimization, and knowledge transfer through two losses that align the encoders and the decoders. Across magnetic resonance imaging (MRI), the single-pixel camera (SPC), and single-disperser coded aperture snapshot spectral imaging (SD-CASSI), the KD approach yields higher reconstruction quality (PSNR gains of up to ~1.53 dB) and better encoder designs (lower condition numbers, reduced Gram matrix coherence, and better spectral sampling) than E2E baselines, and it is robust to noise. The method generalizes to other CI modalities and tasks, and the student runs without the teacher at inference. The KD framework thus provides a principled, flexible way to design physically feasible, high-performance CI systems.
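The encoder-quality metrics named above can be made concrete with a small sketch. It assumes the encoder is represented as a dense sensing matrix `A` with no all-zero columns; the function names are illustrative and do not come from the paper. The condition number is the ratio of the largest to the smallest singular value, and the coherence is the largest absolute off-diagonal entry of the Gram matrix of the column-normalized matrix.

```python
import numpy as np

def condition_number(A):
    """Ratio of the largest to the smallest singular value of A."""
    s = np.linalg.svd(A, compute_uv=False)  # singular values, descending
    return float(s[0] / s[-1])

def mutual_coherence(A):
    """Largest absolute off-diagonal entry of the Gram matrix of A
    after normalizing each column to unit Euclidean norm.
    Assumes A has no all-zero columns."""
    A_unit = A / np.linalg.norm(A, axis=0)
    gram = A_unit.T @ A_unit
    np.fill_diagonal(gram, 0.0)  # ignore the unit diagonal
    return float(np.abs(gram).max())
```

Lower values of both quantities generally indicate a better-conditioned, less redundant sensing matrix, which is the sense in which the summary reports improved encoder designs.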

Abstract

Designing the physical encoder is crucial for accurate image reconstruction in computational imaging (CI) systems. Currently, these systems are designed via end-to-end (E2E) optimization, where the encoder is modeled as a neural network layer and is jointly optimized with the decoder. However, the performance of E2E optimization is significantly reduced by the physical constraints imposed on the encoder. Also, since E2E optimization learns the parameters of the encoder by backpropagating the reconstruction error, it does not promote optimal intermediate outputs and suffers from vanishing gradients. To address these limitations, we reinterpret the concept of knowledge distillation (KD) for designing a physically constrained CI system by transferring the knowledge of a pretrained, less-constrained CI system. Our approach involves three steps: (1) Given the original CI system (student), a teacher system is created by relaxing the constraints on the student's encoder. (2) The teacher is optimized to solve a less-constrained version of the student's problem. (3) The teacher guides the training of the student through two proposed knowledge transfer functions, targeting both the encoder and the decoder feature space. The proposed method can be applied to any imaging modality, since the relaxation scheme and the loss functions can be adapted to the physical acquisition and the employed decoder. This approach was validated on three representative CI modalities: magnetic resonance, single-pixel, and compressive spectral imaging. Simulations show that a teacher system whose encoder has a structure similar to that of the student encoder provides effective guidance. Our approach achieves significantly improved reconstruction performance and encoder design, outperforming both E2E optimization and traditional non-data-driven encoder designs.
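Step (3) can be illustrated with a minimal sketch of a student training objective that adds an encoder term and a decoder term to the reconstruction error. This excerpt does not give the paper's actual definitions of the two transfer losses, so the choices here are stand-in assumptions: the encoder term compares the Gram matrices of the student and teacher sensing matrices, the decoder term compares intermediate feature maps, and `kd_student_loss`, `lam_enc`, and `lam_dec` are hypothetical names.

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two arrays of equal shape."""
    return float(np.mean((np.asarray(a) - np.asarray(b)) ** 2))

def kd_student_loss(x, x_hat_s, A_s, A_t, feats_s, feats_t,
                    lam_enc=1.0, lam_dec=1.0):
    """Illustrative student objective: reconstruction + encoder + decoder terms.

    x        : ground-truth signal
    x_hat_s  : student reconstruction
    A_s, A_t : student / pretrained-teacher sensing matrices (same column count)
    feats_s, feats_t : lists of matching student / teacher decoder features
    The encoder and decoder terms are assumed stand-ins, not the paper's losses.
    """
    rec = mse(x_hat_s, x)
    # Assumed encoder alignment: match Gram matrices, which share a shape
    # even when student and teacher take different numbers of measurements.
    enc = mse(A_s.T @ A_s, A_t.T @ A_t)
    # Assumed decoder alignment: average feature-map MSE across layers.
    dec = sum(mse(fs, ft) for fs, ft in zip(feats_s, feats_t)) / len(feats_s)
    return rec + lam_enc * enc + lam_dec * dec
```

In this sketch the teacher quantities (`A_t`, `feats_t`) are treated as fixed targets, consistent with the teacher being pretrained in step (2); only the student's encoder and decoder would be updated when minimizing the loss.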

Paper Structure

This paper contains 28 sections, 10 equations, 14 figures, 5 tables, 1 algorithm.

Figures (14)

  • Figure 1: The student system is a constrained CI system with encoder $\mathbf{A}_{\boldsymbol{\Phi}_s}$ and decoder $\mathcal{M}_{\boldsymbol{\Theta}_s}$. In the first stage, by relaxing the student encoder constraints, a teacher encoder $\mathbf{A}_{\boldsymbol{\Phi}_t}$ is derived. In the second stage, the teacher encoder and its reconstruction network $\mathcal{M}_{\boldsymbol{\Theta}_t}$ are optimized to solve a less-constrained version of the student's problem, resulting in $\{\mathbf{A}_{\boldsymbol{\Phi}_t^\star}, \mathcal{M}_{\boldsymbol{\Theta}_t^\star}\}$. In the third stage, the knowledge of the pretrained teacher system is used to guide and enhance the performance of the student system's encoder and decoder.
  • Figure 2: Comparison of E2E optimization and the proposed KD methodology for designing CI systems. During training, the pretrained teacher guides the learning of the student through the proposed loss functions $\mathcal{L}_{\text{ENC}}$ and $\mathcal{L}_{\text{DEC}}$. For inference, the student system operates independently.
  • Figure 3: Scheme of the U-Net used as the computational decoder. C is the number of channels of the input image: C=2 for MR images, C=1 for grayscale images, and C=8 for the multi-spectral images. E determines the number of filters of each convolutional layer of the U-Net: E=1 for MRI, E=4 for the SPC, and E=2 for the SD-CASSI system.
  • Figure 4: Reconstruction performance of the student and baseline MRI systems. The first column shows the teacher-optimized undersampling mask and its corresponding reconstruction. The second column presents the student-optimized mask and its reconstruction. The third column displays the baseline-optimized mask and its reconstruction, while the fourth column contains the ground truth image. The PSNR (dB) and SSIM metrics are reported in the upper-right corner of each reconstruction.
  • Figure 5: Comparison of the student MRI system with the baseline and common $k$-space undersampling masks (spiral and radial). On the left, a visual comparison for $AF=8$ is presented, with PSNR (dB) and SSIM metrics displayed in the upper-right corner of each reconstruction. On the right, a plot comparison for $AF \in \{4, 8, 16\}$ is shown.
  • ...and 9 more figures

Theorems & Definitions (1)

  • Remark 1