Moving Object Proposals with Deep Learned Optical Flow for Video Object Segmentation

Ge Shi, Zhili Yang

TL;DR

The paper tackles video moving-object segmentation by leveraging unsupervised optical flow to guide segmentation. It proposes a two-stage pipeline: first train an unsupervised optical-flow network (UnFlow) and then feed its output into a SegNet encoder-decoder to produce moving-object proposals, trained on DAVIS 2017. Key contributions include fine-tuning UnFlow on DAVIS 2017 and adapting SegNet for two-class motion segmentation, with an implementation in TensorFlow on AWS EC2. The results demonstrate the feasibility of motion-guided segmentation, but reveal limitations due to lack of semantic information and boundary artifacts, suggesting future work to integrate temporal models and semantic cues for improved performance.

Abstract

Dynamic scene understanding is one of the most prominent areas of interest in the computer vision community. Pixel-wise segmentation with neural networks is widely used to improve dynamic scene understanding. Recent research on pixel-wise segmentation has combined semantic and motion information and achieved good performance. In this work, we propose a state-of-the-art neural network architecture to obtain moving object proposals (MOP) accurately and efficiently. We first train an unsupervised convolutional neural network (UnFlow) to estimate optical flow. We then feed the output of the optical flow network into a fully convolutional SegNet model. The main contributions of our work are (1) fine-tuning the pretrained optical flow model on the new DAVIS dataset and (2) leveraging fully convolutional neural networks with an encoder-decoder architecture to segment objects. We developed the code in TensorFlow and ran the training and evaluation on an AWS EC2 instance.
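
The data flow of the two-stage pipeline (dense optical flow in, two-class moving/static mask out) can be illustrated with a toy sketch. This is not the authors' implementation: a hand-written flow field stands in for UnFlow's output, and a fixed magnitude threshold stands in for the learned SegNet classifier. The function names and the threshold value are hypothetical, chosen only for illustration.

```python
import math

def flow_magnitude(flow):
    """Per-pixel magnitude of a dense flow field given as rows of (u, v) pairs."""
    return [[math.hypot(u, v) for (u, v) in row] for row in flow]

def two_class_mask(flow, threshold=1.0):
    """Stand-in for the segmentation stage (assumption, not SegNet):
    label a pixel 1 (moving) when its flow magnitude exceeds `threshold`,
    otherwise 0 (static)."""
    return [[1 if m > threshold else 0 for m in row]
            for row in flow_magnitude(flow)]

# Toy 3x4 flow field standing in for the optical-flow stage:
# a 2x2 patch moves 3 px right and 4 px down, the rest is static.
flow = [[(0.0, 0.0)] * 4 for _ in range(3)]
for y in (0, 1):
    for x in (1, 2):
        flow[y][x] = (3.0, 4.0)

mask = two_class_mask(flow)
for row in mask:
    print(row)
```

A threshold on flow magnitude uses motion alone, so it cannot separate a moving object from camera-induced background motion. That is consistent with the limitation noted in the TL;DR about missing semantic information, and it is one reason a learned segmentation network is used in place of such a rule.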

Paper Structure (23 sections, 11 equations, 8 figures, 1 table)

Figures (8)

  • Figure 1: Fusing static object information and motion information to segment images.
  • Figure 2: Convolution and deconvolution layers
  • Figure 3: Unsupervised learning model based on variational constraint.
  • Figure 4: SegNet Architecture
  • Figure 5: SegNet and UnFlow predictions on samples from DAVIS 2017
  • ...and 3 more figures