MatConvNet - Convolutional Neural Networks for MATLAB

Andrea Vedaldi, Karel Lenc

TL;DR

MatConvNet delivers a MATLAB-native framework for CNNs by exposing core building blocks (convolution, pooling, normalization, etc.) as simple functions, enabling rapid prototyping and research within the MATLAB ecosystem. It combines a pair of wrappers (SimpleNN and DagNN) with pre-trained models and learning utilities, supported by GPU-accelerated implementations and optional cuDNN integration to scale to large datasets like ImageNet. The paper details the mathematical foundation, network representations (sequences and DAGs), backpropagation mechanics, and extensive implementation notes across all blocks, including derivatives, receptive field calculations, and numerical stability considerations. The result is a flexible, educational, and scalable CNN toolkit that bridges MATLAB workflows with modern deep learning capabilities, suitable for both experimentation and deployment of large-scale vision models.

Abstract

MatConvNet is an implementation of Convolutional Neural Networks (CNNs) for MATLAB. The toolbox is designed with an emphasis on simplicity and flexibility. It exposes the building blocks of CNNs as easy-to-use MATLAB functions, providing routines for computing linear convolutions with filter banks, feature pooling, and much more. In this manner, MatConvNet allows fast prototyping of new CNN architectures; at the same time, it supports efficient computation on CPU and GPU, making it possible to train complex models on large datasets such as ImageNet ILSVRC. This document provides an overview of CNNs and how they are implemented in MatConvNet, and gives the technical details of each computational block in the toolbox.

Paper Structure

This paper contains 91 sections, 128 equations, 6 figures, 2 tables.

Figures (6)

  • Figure 1: A complete example including download, installing, compiling and running MatConvNet to classify one of MATLAB stock images using a large CNN pre-trained on ImageNet.
  • Figure 2: Training AlexNet on ImageNet ILSVRC: dropout vs batch normalisation.
  • Figure 3: Example DAG.
  • Figure 4: Backpropagation network for a DAG.
  • Figure 5: Convolution. The figure illustrates the process of filtering a 1D signal $\mathbf{x}$ by a filter $f$ to obtain a signal $\mathbf{y}$. The filter has $H'=4$ elements and is applied with a stride of $S_h=2$ samples. The purple areas represent the padding, $P_-=2$ and $P_+=3$, which is zero-filled. Filters are applied in a sliding-window manner across the input signal. The samples of $\mathbf{x}$ involved in the calculation of a sample of $\mathbf{y}$ are shown with arrows. Note that the rightmost sample of $\mathbf{x}$ is never processed by any filter application due to the sampling step. While in this case the sample is in the padded region, this can also happen without padding.
  • ...and 1 more figures
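
The sliding-window filtering described in the Figure 5 caption can be illustrated with a minimal sketch. This is plain Python rather than MatConvNet code, and it is not the toolbox's implementation. The function name `conv1d`, the input length of 8, and the sample values are assumptions chosen for illustration; only the filter length (4), the stride (2), and the padding (2 on the left, 3 on the right) come from the caption. The sketch also assumes the filter is applied without flipping.

```python
def conv1d(x, f, stride=1, pad_left=0, pad_right=0):
    """Slide filter f over a zero-padded copy of x (no filter flipping)."""
    # Zero-fill the padded regions on both sides of the signal.
    padded = [0.0] * pad_left + list(x) + [0.0] * pad_right
    # Number of full filter placements that fit at the given stride.
    n_out = (len(padded) - len(f)) // stride + 1
    y = []
    for i in range(n_out):
        start = i * stride
        window = padded[start:start + len(f)]
        y.append(sum(w * c for w, c in zip(window, f)))
    return y


# Hypothetical input of length 8; filter, stride and padding follow Figure 5.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
f = [1.0, 1.0, 1.0, 1.0]
y = conv1d(x, f, stride=2, pad_left=2, pad_right=3)

# The last window starts at padded index 8 and ends at index 11,
# so the final padded sample (index 12) is never read.
last_index_read = (len(y) - 1) * 2 + len(f) - 1
```

With these assumed sizes the padded signal has 13 samples and the filter fits at 5 positions, so the rightmost padded sample falls outside every window, which is the effect the caption points out.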

Theorems & Definitions (1)

  • Example 1