Region-of-Interest-Guided Deep Joint Source-Channel Coding for Image Transmission

Hansung Choi, Daewon Seo

TL;DR

This work addresses the mismatch between overall image fidelity and user-perceived quality by concentrating transmission resources on regions of interest (ROI). It introduces ROI-JSCC, an ROI-guided deep joint source-channel coding framework that combines ROI embedding, ROI-guided split processing, an ROI-based loss, and ROI-adaptive bandwidth allocation to raise ROI fidelity while preserving average image quality. Experiments on datasets such as the DIV2K validation set show substantial ROI improvements with competitive overall quality and low computational overhead, across both AWGN and fast Rayleigh fading channels. The method suits ROI-critical applications such as autonomous driving and VR/AR, since it delivers higher ROI clarity without retraining for different ROI locations.

Abstract

Deep joint source-channel coding (deepJSCC) methods have shown promising improvements in communication performance over wireless networks. However, existing approaches primarily focus on enhancing overall image reconstruction quality, which may not fully align with user experiences, often driven by the quality of regions of interest (ROI). Motivated by this, we propose ROI-guided joint source-channel coding (ROI-JSCC), a novel deepJSCC framework that prioritizes high-quality transmission of ROI. The ROI-JSCC consists of four key components: (1) Image ROI embedding, (2) ROI-guided split processing, (3) ROI-based loss function design, and (4) ROI-adaptive bandwidth allocation. Together, these components allow ROI-JSCC to selectively enhance the ROI reconstruction quality at varying ROI positions while maintaining overall image quality with minimal computational overhead. Experimental results under diverse communication environments demonstrate that ROI-JSCC significantly improves ROI reconstruction quality while maintaining competitive average image quality compared to recent state-of-the-art methods.

Paper Structure

This paper contains 18 sections, 3 equations, 9 figures.

Figures (9)

  • Figure 1: ROI-JSCC communication system.
  • Figure 2: The structures of the encoder and the decoder.
  • Figure 3: ROI block structure and process for ROI $\gamma =(2,2)$.
  • Figure 4: The ROI importance mask generation process for ROI $\gamma =(2,2)$.
  • Figure 5: The plots to the left of the vertical line show ${\textrm{PSNR}}_{{\textrm{ROI}}}$ (solid lines) and ${\textrm{PSNR}}_{{\textrm{Avg}}}$ (dotted lines) on the DIV2K validation dataset under different CPP and SNR settings. The plot to the right of the vertical line shows ${\textrm{PSNR}}_{{\textrm{ROI}}}$ on the DIV2K validation dataset for the ablation study of the ROI-based mechanisms in ROI-JSCC.
  • ...and 4 more figures
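
The two metrics named in the Figure 5 caption can be illustrated with a minimal sketch. It is not the authors' evaluation code, and several details are assumptions made only for illustration: the image is split into a square grid of equal blocks, the ROI index $\gamma$ is read as a 1-indexed (row, column) block position, and ${\textrm{PSNR}}_{{\textrm{Avg}}}$ is taken to mean PSNR over the whole image. The helper names `psnr` and `roi_box` are hypothetical.

```python
import math


def psnr(ref, rec, region=None, peak=255.0):
    """PSNR in dB between two equally sized 2-D images (lists of rows).

    `region` is an optional (row_start, row_end, col_start, col_end) box
    with exclusive end indices; when given, only pixels inside it count.
    """
    r0, r1, c0, c1 = region if region else (0, len(ref), 0, len(ref[0]))
    sq_err = 0.0
    count = 0
    for i in range(r0, r1):
        for j in range(c0, c1):
            sq_err += (ref[i][j] - rec[i][j]) ** 2
            count += 1
    mse = sq_err / count
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)


def roi_box(gamma, image_size, grid):
    """Pixel box of block `gamma` on a grid x grid partition.

    Assumption: `gamma` is a 1-indexed (row, col) block position and the
    image height and width are divisible by `grid`.
    """
    height, width = image_size
    block_h, block_w = height // grid, width // grid
    row, col = gamma[0] - 1, gamma[1] - 1
    return (row * block_h, (row + 1) * block_h,
            col * block_w, (col + 1) * block_w)


# Toy 8x8 image: the reconstruction is closer to the reference inside the ROI
# (error 1) than outside it (error 4), mimicking ROI-prioritized transmission.
ref = [[100.0] * 8 for _ in range(8)]
box = roi_box((2, 2), (8, 8), grid=4)
rec = [
    [ref[i][j] + (1.0 if box[0] <= i < box[1] and box[2] <= j < box[3] else 4.0)
     for j in range(8)]
    for i in range(8)
]

psnr_roi = psnr(ref, rec, region=box)  # PSNR restricted to the ROI block
psnr_avg = psnr(ref, rec)              # PSNR over the whole image
print(f"PSNR_ROI = {psnr_roi:.2f} dB, PSNR_Avg = {psnr_avg:.2f} dB")
```

In this toy setup the ROI-restricted PSNR is higher than the whole-image PSNR because the reconstruction error is smaller inside the ROI block, which is the kind of gap the solid and dotted curves in Figure 5 compare.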