Is Bigger Always Better? Efficiency Analysis in Resource-Constrained Small Object Detection

Kwame Mbobda-Kuate, Gabriel Kasmi

TL;DR

A systematic efficiency analysis on rooftop PV detection in Madagascar finds small high-resolution configurations are Pareto-dominant across all 44 setups in the joint accuracy-throughput space, leaving no tradeoff to resolve.

Abstract

Scaling laws assume larger models trained on more data consistently outperform smaller ones -- an assumption that drives model selection in computer vision but remains untested in resource-constrained Earth observation (EO). We conduct a systematic efficiency analysis across three scaling dimensions: model size, dataset size, and input resolution, on rooftop PV detection in Madagascar. Optimizing for model efficiency (mAP$_{50}$ per unit of model size), we find a consistent efficiency inversion: YOLO11N achieves both the highest efficiency ($24\times$ higher than YOLO11X) and the highest absolute mAP$_{50}$ (0.617). Resolution is the dominant resource allocation lever ($+$120% efficiency gain), while additional data yields negligible returns at low resolution. These findings are robust to the deployment objective: small high-resolution configurations are Pareto-dominant across all 44 setups in the joint accuracy-throughput space, leaving no tradeoff to resolve. In data-scarce EO, bigger is not just unnecessary: it can be worse.
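The efficiency metric used in the abstract (mAP$_{50}$ per unit of model size) can be illustrated with a short sketch. It assumes model size is measured in megabytes and normalized per 10 MB, as in the Figure 3 caption. The function name `efficiency` and the two example entries are hypothetical placeholders, not values reported in the paper.

```python
def efficiency(map50: float, size_mb: float, unit_mb: float = 10.0) -> float:
    """Return mAP50 per `unit_mb` megabytes of model size.

    Assumption: "model size" means on-disk weight size in MB; the paper
    may define it differently (e.g. by parameter count).
    """
    if size_mb <= 0:
        raise ValueError("size_mb must be positive")
    return map50 / (size_mb / unit_mb)


# Hypothetical placeholder models (name -> (mAP50, size in MB)).
# These numbers are illustrative only and are NOT taken from the paper.
models = {
    "small_model": (0.60, 6.0),
    "large_model": (0.50, 100.0),
}

scores = {name: efficiency(m, s) for name, (m, s) in models.items()}

# Rank models from most to least efficient.
ranking = sorted(scores, key=scores.get, reverse=True)

# Relative efficiency of the best model versus the worst one.
ratio = scores[ranking[0]] / scores[ranking[-1]]
```

Under this definition, a model can rank first on efficiency without ranking first on absolute mAP$_{50}$, so the two quantities need to be checked separately, as the abstract does.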

Paper Structure (41 sections, 1 equation, 9 figures, 8 tables)

Figures (9)

  • Figure 1: Small models at high resolution dominate across all deployment objectives. Detection performance (mAP$_{50}$) versus inference speed (FPS) for all 44 experimental configurations on the OpenStat Madagascar rooftop PV dataset [OpenStat]. Marker shape encodes input resolution (circle: 416 px, square: 640 px, triangle: 1280 px) and marker size encodes the training dataset fraction (10% to 100%). The dashed line delineates the Pareto frontier (the set of configurations for which no other configuration is superior on both axes simultaneously). YOLO11N and YOLO11S at 1280 px lie at the apex, simultaneously maximising accuracy and throughput with no accuracy--speed tradeoff to resolve.
  • Figure 2: Cumulative distribution of bounding box areas (normalized by image area) across the training set. The median normalized area is 0.006, and 64% of boxes fall below the 0.01 threshold commonly used to define small objects in detection benchmarks [lin2017feature]. This confirms that rooftop PV detection in drone imagery is structurally a small object task, justifying the central role of input resolution in our efficiency analysis.
  • Figure 3: Efficiency ranking of the evaluated models (mAP$_{50}$ per 10 MB of model size). YOLO11N achieves $24\times$ higher efficiency than YOLO11X while also reaching the highest absolute mAP$_{50}$, ruling out any accuracy--efficiency tradeoff.
  • Figure 4: Data efficiency curves for each YOLO11 variant averaged across input resolutions. YOLO11N shows near-zero sensitivity to data volume ($-$0.5% from 10% to 100%), while larger models exhibit pronounced data hunger. Relative gains increase monotonically with model size.
  • Figure 5: mAP$_{50}$ vs. overparameterization ratio $\rho = \text{params} / N_{\text{train}}$. A negative log-linear trend confirms that higher overparameterization predicts lower performance, consistent with the bias-variance tradeoff. Each point is one configuration (model $\times$ fraction $\times$ resolution).
  • ...and 4 more figures
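
The Pareto frontier described in the Figure 1 caption can be computed with a simple pairwise dominance check. The sketch below uses the usual dominance definition (at least as good on both axes and strictly better on at least one), which is slightly stricter than the caption's wording. The `Config` type, the function names, and the four example configurations are hypothetical and do not reproduce the paper's 44 setups.

```python
from typing import List, NamedTuple


class Config(NamedTuple):
    name: str
    map50: float  # detection accuracy, higher is better
    fps: float    # inference throughput, higher is better


def dominates(a: Config, b: Config) -> bool:
    """True if `a` is at least as good as `b` on both axes and strictly better on one."""
    no_worse = a.map50 >= b.map50 and a.fps >= b.fps
    strictly_better = a.map50 > b.map50 or a.fps > b.fps
    return no_worse and strictly_better


def pareto_frontier(configs: List[Config]) -> List[Config]:
    """Return the configurations that no other configuration dominates (O(n^2))."""
    return [c for c in configs if not any(dominates(o, c) for o in configs)]


# Hypothetical placeholder configurations; NOT results from the paper.
configs = [
    Config("A", map50=0.60, fps=100.0),
    Config("B", map50=0.55, fps=120.0),
    Config("C", map50=0.50, fps=90.0),   # dominated by A (worse on both axes)
    Config("D", map50=0.40, fps=150.0),
]

frontier = pareto_frontier(configs)
frontier_names = [c.name for c in frontier]
```

With only 44 configurations the quadratic check is more than fast enough; a sort-based sweep would only matter for much larger sets.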