Resnet in Resnet: Generalizing Residual Architectures

Sasha Targ, Diogo Almeida, Kevin Lyman

TL;DR

The paper targets limitations of ResNets arising from fixed identity shortcuts by introducing a generalized residual block with dual parallel streams (residual and transient). This dual-stream, cross-constrained design is instantiated as RiR using a ResNet Init initialization that adds negligible overhead. Empirical results on CIFAR-10/100 demonstrate consistent gains over ResNet and set a new state-of-the-art on CIFAR-100, with ablations confirming contributions from both streams. The approach broadens residual-network design space by enabling deeper, more flexible processing while preserving optimization benefits.

Abstract

Residual networks (ResNets) have recently achieved state-of-the-art on challenging computer vision tasks. We introduce Resnet in Resnet (RiR): a deep dual-stream architecture that generalizes ResNets and standard CNNs and is easily implemented with no computational overhead. RiR consistently improves performance over ResNets, outperforms architectures with similar amounts of augmentation on CIFAR-10, and establishes a new state-of-the-art on CIFAR-100.
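To make the dual-stream idea concrete, here is a minimal sketch of one generalized residual layer. It is not the authors' implementation. The text above does not state the update rule, so the form used here is an assumption: a residual stream `r` that keeps an identity shortcut, a transient stream `t` that has none, and four weight sets connecting them. Dense matrix products stand in for convolutions, and the names `generalized_residual_block`, `W_rr`, `W_tr`, `W_rt`, and `W_tt` are hypothetical.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def generalized_residual_block(r, t, W_rr, W_tr, W_rt, W_tt):
    """One dual-stream layer (assumed form, dense layers instead of convs).

    r: residual-stream activations, shape (batch, features)
    t: transient-stream activations, shape (batch, features)
    W_ab: weights mapping stream a to stream b
    """
    # Residual stream: same-stream and cross-stream terms plus identity shortcut.
    r_next = relu(r @ W_rr + t @ W_tr + r)
    # Transient stream: same-stream and cross-stream terms, no shortcut.
    t_next = relu(r @ W_rt + t @ W_tt)
    return r_next, t_next


rng = np.random.default_rng(0)
r = rng.normal(size=(2, 4))
t = rng.normal(size=(2, 4))
W_rr, W_tt = rng.normal(size=(4, 4)), rng.normal(size=(4, 4))
zeros = np.zeros((4, 4))

# Only the residual-to-residual weights are active:
# the residual stream behaves like a single-layer residual unit.
r_res, t_res = generalized_residual_block(r, t, W_rr, zeros, zeros, zeros)

# Only the transient-to-transient weights are active:
# the transient stream behaves like a plain feed-forward layer.
r_cnn, t_cnn = generalized_residual_block(r, t, zeros, zeros, zeros, W_tt)
```

Under these assumptions, zeroing different weight sets recovers a residual-style layer or a plain layer from the same block, which is one way to read the claim that the architecture generalizes both ResNets and standard CNNs.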

Paper Structure

This paper contains 8 sections, 2 equations, 4 figures, 7 tables.

Figures (4)

  • Figure 1: (a) 2-layer ResNet block. (b) 2 generalized residual blocks (ResNet Init). (c) 2-layer ResNet block from 2 generalized residual blocks (grayed out connections are 0). (d) 2-layer RiR block.
  • Figure 2: Relationship between standard CNN, ResNet, ResNet Init, and RiR architectures.
  • Figure 3: Effect of ablating each stream of the generalized residual network architecture.
  • Figure 4: ResNet and RiR with increased layers/block. All models have 15 blocks.