Table of Contents
Fetching ...

Context Propagation from Proposals for Semantic Video Object Segmentation

Tinghuai Wang

TL;DR

The paper tackles semantic video object segmentation by learning and propagating semantic context from video object proposals. It builds a sparse, similarity-based graph of superpixels, derives context exemplars from annotated frames, and performs a two-stage link-propagation to predict cross-frame context links, which are then integrated into a fully connected CRF for labeling. The approach yields significant gains on YouTube-Objects, evidencing that modeling high-level semantic context across space and time improves robustness to ambiguities and appearance variations. This context-centric, nonparametric framework offers a practical method for leveraging proposal-driven semantics without heavy training data while delivering competitive performance gains in challenging video data.

Abstract

In this paper, we propose a novel approach to learning semantic contextual relationships in videos for semantic object segmentation. Our algorithm derives the semantic contexts from video object proposals which encode the key evolution of objects and the relationship among objects over the spatio-temporal domain. This semantic contexts are propagated across the video to estimate the pairwise contexts between all pairs of local superpixels which are integrated into a conditional random field in the form of pairwise potentials and infers the per-superpixel semantic labels. The experiments demonstrate that our contexts learning and propagation model effectively improves the robustness of resolving visual ambiguities in semantic video object segmentation compared with the state-of-the-art methods.

Context Propagation from Proposals for Semantic Video Object Segmentation

TL;DR

The paper tackles semantic video object segmentation by learning and propagating semantic context from video object proposals. It builds a sparse, similarity-based graph of superpixels, derives context exemplars from annotated frames, and performs a two-stage link-propagation to predict cross-frame context links, which are then integrated into a fully connected CRF for labeling. The approach yields significant gains on YouTube-Objects, evidencing that modeling high-level semantic context across space and time improves robustness to ambiguities and appearance variations. This context-centric, nonparametric framework offers a practical method for leveraging proposal-driven semantics without heavy training data while delivering competitive performance gains in challenging video data.

Abstract

In this paper, we propose a novel approach to learning semantic contextual relationships in videos for semantic object segmentation. Our algorithm derives the semantic contexts from video object proposals which encode the key evolution of objects and the relationship among objects over the spatio-temporal domain. This semantic contexts are propagated across the video to estimate the pairwise contexts between all pairs of local superpixels which are integrated into a conditional random field in the form of pairwise potentials and infers the per-superpixel semantic labels. The experiments demonstrate that our contexts learning and propagation model effectively improves the robustness of resolving visual ambiguities in semantic video object segmentation compared with the state-of-the-art methods.
Paper Structure (9 sections, 7 equations, 1 figure, 1 table, 1 algorithm)

This paper contains 9 sections, 7 equations, 1 figure, 1 table, 1 algorithm.

Figures (1)

  • Figure 1: Qualitative results of our algorithm on YouTube-Objects Dataset.