MFI-ResNet: Efficient ResNet Architecture Optimization via MeanFlow Compression and Selective Incubation

Nuolin Sun, Linyuan Wang, Haonan Wei, Lei Li, Bin Yan

TL;DR

This work reframes ResNet as a discretized ODE and introduces MeanFlow-Incubated ResNet (MFI-ResNet), which compresses each stage by replacing its stack of residual blocks with one or two MeanFlow modules that model the average velocity field over $t\in[0,1]$. To counteract the resulting capacity loss, it applies a selective incubation expansion that reintroduces ResNet blocks in stages 1–3 while keeping a two-layer MeanFlow module in stage 4, followed by end-to-end fine-tuning. Empirically, MFI-ResNet reduces parameters by about 46% relative to ResNet-50 while slightly improving accuracy on CIFAR-10 and CIFAR-100 (by 0.23% and 0.17%, respectively), with similar gains reported for ResNet-34 variants. This suggests that generative flow fields can capture stage-wise feature transformations, linking discriminative learning and generative modeling at the feature-transformation level and offering a practical path to more parameter-efficient deep networks.
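
The ODE reading and the role of the average velocity can be written out in a few lines. This is a sketch in generic notation: the symbols $x$, $f_k$, $v$, $h$, $K$ and $\bar{u}$ are assumptions chosen for illustration and are not necessarily the paper's own.

$$
\begin{aligned}
\text{residual block:}\quad & x_{k+1} = x_k + f_k(x_k), \qquad k = 0,\dots,K-1,\\
\text{forward Euler step:}\quad & x_{k+1} = x_k + h\,v(x_k, t_k), \qquad h = \tfrac{1}{K},\ \ \frac{dx}{dt} = v\big(x(t), t\big),\\
\text{average velocity on } [0,1]\text{:}\quad & \bar{u} = \int_0^1 v\big(x(\tau), \tau\big)\,d\tau,\\
\text{one-step map:}\quad & x(1) = x(0) + \bar{u}.
\end{aligned}
$$

Identifying $f_k$ with $h\,v(\cdot, t_k)$ makes a stage of $K$ residual blocks a $K$-step Euler integration from $t=0$ to $t=1$. The last line is the fundamental theorem of calculus applied to the same ODE: a module that predicts $\bar{u}$ directly reaches the stage output in a single step.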

Abstract

ResNet has achieved tremendous success in computer vision through its residual connection mechanism. ResNet can be viewed as a discretized form of ordinary differential equations (ODEs). From this perspective, the multiple residual blocks within a single ResNet stage essentially perform multi-step discrete iterations of the feature transformation for that stage. The recently proposed flow matching model, MeanFlow, enables one-step generative modeling by learning the mean velocity field to transform distributions. Inspired by this, we propose MeanFlow-Incubated ResNet (MFI-ResNet), which employs a compression-expansion strategy to jointly improve parameter efficiency and discriminative performance. In the compression phase, we simplify the multi-layer structure within each ResNet stage to one or two MeanFlow modules to construct a lightweight meta model. In the expansion phase, we apply a selective incubation strategy to the first three stages, expanding them to match the residual block configuration of the baseline ResNet model, while keeping the last stage in MeanFlow form, and fine-tune the incubated model. Experimental results show that on CIFAR-10 and CIFAR-100 datasets, MFI-ResNet achieves remarkable parameter efficiency, reducing parameters by 46.28% and 45.59% compared to ResNet-50, while still improving accuracy by 0.23% and 0.17%, respectively. This demonstrates that generative flow-fields can effectively characterize the feature transformation process in ResNet, providing a new perspective for understanding the relationship between generative modeling and discriminative learning.

Paper Structure

This paper contains 15 sections, 9 equations, 3 figures, 3 tables.

Figures (3)

  • Figure 1: Comparison of feature transformation in a ResNet stage and a MeanFlow stage. (a) A ResNet stage implements a multi-step residual evolution of the feature distribution, where several residual blocks realize the discrete update. (b) A MeanFlow stage models a one-step continuous flow of the feature distribution, directly mapping via an average velocity field.
  • Figure 2: The MFI-ResNet architecture. Top-left: We train four independent MeanFlow modules to learn explicit feature mappings from pre-trained ResNet stages, constructing a lightweight meta model. Bottom-left: The selective incubation strategy progressively replaces MeanFlow modules with corresponding ResNet residual blocks for stages 1-3. Right: The final MFI-ResNet hybrid architecture combines incubated ResNet stages (1-3) with the retained MeanFlow module (stage 4), achieving parameter efficiency while maintaining discriminative performance.
  • Figure 3: Selective incubation strategy for MFI-ResNet construction. Top: Four independent MeanFlow modules are trained to learn stage-wise feature mappings from pre-trained ResNet. Middle: Deep Incubation process where each stage (1-3) is independently replaced with corresponding pre-trained ResNet residual blocks while keeping other stages frozen. Bottom: Final MFI-ResNet-50 architecture combines incubated stages 1-3 with the retained two-layer MeanFlow module for stage 4, followed by global fine-tuning to achieve optimal performance.
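
The stage layout described in the captions above can be sketched as a small configuration transform. This is an illustration under stated assumptions, not the authors' code: `StageSpec`, `meta_model` and `selectively_incubate` are hypothetical names, and the meta-model depths `(1, 1, 1, 2)` assume one MeanFlow module in each of stages 1–3 (the source says only "one or two" per stage and confirms two layers for stage 4). The per-stage block counts `(3, 4, 6, 3)` are those of the standard ResNet-50.

```python
from dataclasses import dataclass

# Bottleneck blocks per stage in the standard ResNet-50.
RESNET50_BLOCKS = (3, 4, 6, 3)


@dataclass(frozen=True)
class StageSpec:
    kind: str   # "meanflow" or "resnet"
    depth: int  # number of MeanFlow layers or residual blocks in the stage


def meta_model(meanflow_depths=(1, 1, 1, 2)):
    """Compression phase: every stage becomes a MeanFlow stage.

    The default depths are an assumption for stages 1-3; only the
    two-layer stage 4 is stated in the source.
    """
    return [StageSpec("meanflow", d) for d in meanflow_depths]


def selectively_incubate(meta, baseline_blocks, incubated_stages=(1, 2, 3)):
    """Expansion phase: restore the baseline block count in the chosen stages."""
    hybrid = []
    for idx, (spec, n_blocks) in enumerate(zip(meta, baseline_blocks), start=1):
        if idx in incubated_stages:
            hybrid.append(StageSpec("resnet", n_blocks))
        else:
            hybrid.append(spec)  # stage stays in MeanFlow form
    return hybrid


meta = meta_model()
hybrid = selectively_incubate(meta, RESNET50_BLOCKS)
for idx, spec in enumerate(hybrid, start=1):
    print(f"stage {idx}: {spec.kind} x {spec.depth}")
```

The sketch covers only the architecture bookkeeping. Training the MeanFlow modules against the pre-trained stages, incubating each stage with the others frozen, and the final global fine-tuning are left out.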