Learning Deep Structured Models

Liang-Chieh Chen, Alexander G. Schwing, Alan L. Yuille, Raquel Urtasun

TL;DR

The paper addresses the prediction of multiple statistically related outputs by combining Markov random fields (MRFs) with deep feature learning. It introduces a training algorithm that learns the MRF parameters jointly with the deep features that form the MRF potentials, so the whole model is optimized end to end. The method gains efficiency by blending learning and inference and by using GPU acceleration. Experiments on predicting words from noisy images and on multi-class classification of Flickr photographs show that joint learning yields significant performance gains over learning the features and the MRF parameters separately.

Abstract

Many problems in real-world applications involve predicting several random variables which are statistically related. Markov random fields (MRFs) are a great mathematical tool to encode such relationships. The goal of this paper is to combine MRFs with deep learning algorithms to estimate complex representations while taking into account the dependencies between the output random variables. Towards this goal, we propose a training algorithm that is able to learn structured models jointly with deep features that form the MRF potentials. Our approach is efficient as it blends learning and inference and makes use of GPU acceleration. We demonstrate the effectiveness of our algorithm in the tasks of predicting words from noisy images, as well as multi-class classification of Flickr photographs. We show that joint learning of the deep features and the MRF parameters results in significant performance gains.
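To make the phrase "deep features that form the MRF potentials" concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the paper's algorithm: the one-layer tanh "network", the chain structure, the function names, and the toy numbers are all hypothetical, and exact inference by brute-force enumeration is swapped in for the approximate inference a real model would need.

```python
import itertools
import math

def unary_scores(x, weights):
    # Hypothetical stand-in for a deep network: one tanh unit per label.
    return [math.tanh(w * x) for w in weights]

def joint_score(y, xs, weights, pairwise):
    # Score of a joint labeling on a chain: unary terms plus pairwise terms
    # between neighbouring variables.
    score = sum(unary_scores(x, weights)[yi] for x, yi in zip(xs, y))
    score += sum(pairwise[a][b] for a, b in zip(y, y[1:]))
    return score

def all_labelings(xs, weights):
    # Every joint assignment; only feasible for tiny toy problems.
    return itertools.product(range(len(weights)), repeat=len(xs))

def map_assignment(xs, weights, pairwise):
    # Exact MAP by enumeration (assumption: small output space).
    return max(all_labelings(xs, weights),
               key=lambda y: joint_score(y, xs, weights, pairwise))

def probability(y, xs, weights, pairwise):
    # p(y | x) proportional to exp(score), normalized over all labelings.
    log_z = math.log(sum(math.exp(joint_score(c, xs, weights, pairwise))
                         for c in all_labelings(xs, weights)))
    return math.exp(joint_score(y, xs, weights, pairwise) - log_z)

# Toy inputs (hypothetical): three variables, two labels each.
xs = [1.0, -1.0, 1.0]
weights = [1.0, -1.0]
agree = [[1.0, -1.0], [-1.0, 1.0]]   # rewards equal neighbouring labels
independent = [[0.0, 0.0], [0.0, 0.0]]  # no coupling between outputs

print(map_assignment(xs, weights, independent))
print(map_assignment(xs, weights, agree))
```

With these toy numbers, the uncoupled model labels each variable on its own and returns (0, 1, 0), while the pairwise term that rewards agreement changes the joint prediction to (0, 0, 0). In joint training, gradients of a loss on `probability` would flow into both `weights` and `pairwise`, which is the coupling the abstract describes.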

Paper Structure

This paper contains 12 sections, 1 figure, 1 table.

Figures (1)

  • Figure 1: Sample Figure Caption