Deep Learning of Representations: Looking Forward

Yoshua Bengio

TL;DR

This paper surveys forward-looking challenges for deep representation learning, focusing on scaling to large models and datasets, mitigating optimization difficulties, improving inference and sampling for probabilistic models, and achieving disentangled representations. It outlines concrete solution paths such as asynchronous SGD, sparse updates, and conditional computation for scaling, along with curriculum learning and architecture changes to ease optimization. It also proposes moving beyond explicit latent-variable marginalization toward learned computational graphs and generative stochastic networks to handle multi-modal posteriors efficiently. The ideas discussed aim at more scalable, robust, and interpretable deep representations, with faster inference and better transfer across tasks.

Abstract

Deep learning research aims to find learning algorithms that discover multiple levels of distributed representations, with higher levels representing more abstract concepts. Although the study of deep learning has already led to impressive theoretical results, learning algorithms, and breakthrough experiments, several challenges lie ahead. This paper examines some of these challenges, centering on the questions of scaling deep learning algorithms to much larger models and datasets, reducing optimization difficulties due to ill-conditioning or local minima, designing more efficient and powerful inference and sampling procedures, and learning to disentangle the factors of variation underlying the observed data. It also proposes a few forward-looking research directions aimed at overcoming these challenges.

Paper Structure

This paper contains 33 sections.