Object-Attribute-Relation Representation Based Video Semantic Communication

Qiyuan Du; Yiping Duan; Qianqian Yang; Xiaoming Tao; Mérouane Debbah

TL;DR

The paper tackles the challenge of efficient video transmission under bandwidth and noise constraints by introducing a structured semantic representation, Object-Attribute-Relation (OAR), for videos. OAR encodes scenes as graphs of objects, their attributes, and inter-object relations, enabling low-bit-rate transmission and generative reconstruction conditioned on a reference frame. The authors integrate OAR into both video coding and joint source-channel coding (JSCC) pipelines, including an OAR-modulated image JSCC backbone and an OAR-based video transmission pipeline that transmits OAR sequences alongside reference frames. Experimental results on traffic-surveillance datasets show that OAR-based coding improves perceptual quality and downstream task performance at low bit-rates and robustly supports transmission over noisy channels, with notable rate savings and mAP gains compared to conventional codecs and prior semantic approaches. The work also provides extensive ablation studies and discusses inference-time considerations and potential applications such as privacy-aware or scene-specific semantic control.
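To make the idea of a scene encoded as a graph of objects, attributes, and relations concrete, the following is a minimal sketch of what one frame's OAR record might look like as a data structure. It is not the authors' representation or codec: the class names (`OARObject`, `OARRelation`, `OARFrame`), the attribute keys (`color`, `bbox`), the predicate `behind`, and the JSON serialization are all assumptions chosen for illustration.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class OARObject:
    """One detected object (hypothetical schema, not from the paper)."""
    obj_id: int
    category: str
    attributes: dict = field(default_factory=dict)


@dataclass
class OARRelation:
    """A directed relation between two objects, referenced by id."""
    subject_id: int
    predicate: str
    object_id: int


@dataclass
class OARFrame:
    """Objects plus relations for a single frame: a small scene graph."""
    frame_index: int
    objects: list = field(default_factory=list)
    relations: list = field(default_factory=list)

    def to_bytes(self) -> bytes:
        # Compact JSON stands in for a real entropy coder (assumption).
        return json.dumps(asdict(self), separators=(",", ":")).encode("utf-8")

    @classmethod
    def from_bytes(cls, payload: bytes) -> "OARFrame":
        raw = json.loads(payload.decode("utf-8"))
        return cls(
            frame_index=raw["frame_index"],
            objects=[OARObject(**o) for o in raw["objects"]],
            relations=[OARRelation(**r) for r in raw["relations"]],
        )


# Illustrative traffic scene: a red car behind a white bus.
frame = OARFrame(
    frame_index=0,
    objects=[
        OARObject(0, "car", {"color": "red", "bbox": [120, 80, 64, 40]}),
        OARObject(1, "bus", {"color": "white", "bbox": [300, 60, 150, 90]}),
    ],
    relations=[OARRelation(0, "behind", 1)],
)

payload = frame.to_bytes()
restored = OARFrame.from_bytes(payload)
raw_rgb_bytes = 960 * 540 * 3  # one uncompressed RGB frame at an assumed 960x540
print(len(payload), "bytes of OAR vs", raw_rgb_bytes, "bytes of raw pixels")
```

Even with an unoptimized text serialization, the structured record is orders of magnitude smaller than the pixels it describes, which is the intuition behind transmitting OAR sequences together with an occasional reference frame.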

Abstract

With the rapid growth of multimedia data volume, there is an increasing need for efficient video transmission in applications such as virtual reality and future video streaming services. Semantic communication is emerging as a vital technique for ensuring efficient and reliable transmission in low-bandwidth, high-noise settings. However, most current approaches focus on joint source-channel coding (JSCC) that depends on end-to-end training. These methods often lack an interpretable semantic representation and struggle with adaptability to various downstream tasks. In this paper, we introduce the use of object-attribute-relation (OAR) as a semantic framework for videos to facilitate low bit-rate coding and enhance the JSCC process for more effective video transmission. We utilize OAR sequences for both low bit-rate representation and generative video reconstruction. Additionally, we incorporate OAR into the image JSCC model to prioritize communication resources for areas more critical to downstream tasks. Our experiments on traffic surveillance video datasets assess the effectiveness of our approach in terms of video transmission performance. The empirical findings demonstrate that our OAR-based video coding method not only outperforms H.265 coding at lower bit-rates but also synergizes with JSCC to deliver robust and efficient video transmission.

Paper Structure (29 sections, 11 equations, 15 figures, 4 tables)

Introduction
Object-Attribute-Relation-Based Video Representation and Coding
Object-Attribute-Relation Formulation and OAR Based Video Communication Framework
OAR Extraction and Representation
OAR Based Video Generative Reconstruction
OAR Embedding and Graph Computing
OAR Layout Generation
Optical Flow Estimation and Frame Deformation
Image Synthesis and Fusion
OAR-assisted JSCC and Video Transmission
OAR-modulated JSCC
OAR-based Video Transmission Pipeline
Experiments and Results
Dataset, Performance Metrics and Baselines
Dataset Preparation
...and 14 more sections

Figures (15)

Figure 1: Comparison between traditional communication, classical semantic communication (without the OAR representation module) and the proposed OAR-based video transmission frameworks.
Figure 2: Overall framework of OAR-based video compressive coding and transmission. Frames are represented by OAR and transmitted via LDPC channel coding and QAM modulation. Additionally, the reference frame is coded and transmitted via OAR-assisted JSCC.
Figure 3: Framework of OAR-based video generation and visualizations for intermediate results.
Figure 4: The framework of OAR-assisted image JSCC. The SNR is assumed to be available at both the transmitter and the receiver. OAR is transmitted losslessly, so the transmitter and receiver obtain identical OAR features $\mathbf{F}_\text{l}$ by using OAR feature extraction networks with identical parameters.
Figure 5: Performance comparison of the proposed OAR-based method with H.264, H.265 and DVC at different bit-rates on the UA-DETRAC dataset.
...and 10 more figures
