COLORA: Efficient Fine-Tuning for Convolutional Models with a Study Case on Optical Coherence Tomography Image Classification
Mariano Rivera, Angello Hoyos
TL;DR
CoLoRA bridges LoRA and CNNs by factorizing convolutional updates into separable depthwise and pointwise components, maintaining a frozen backbone while training a small set of parameters. Updates are merged into the pretrained weights after each epoch to preserve inference cost, yielding a practical parameter-efficient fine-tuning approach. On OCTMNISTv2, CoLoRA applied to VGG16 and ResNet50v2 achieves competitive accuracy and AUC with significantly fewer trainable parameters and roughly 20% faster per-epoch training, illustrating strong applicability to medical image classification. The work highlights stability, deployability, and potential for broad adoption across CNN-based models and modalities, with future directions toward larger datasets, 1D/3D domains, and hybrid efficiency strategies.
Abstract
We introduce CoLoRA (Convolutional Low-Rank Adaptation), a parameter-efficient fine-tuning method for convolutional neural networks (CNNs). CoLoRA extends LoRA to convolutional layers by decomposing kernel updates into lightweight depthwise and pointwise components. This design reduces the number of trainable parameters to 0.2 compared to conventional fine-tuning, preserves the original model size, and allows merging updates into the pretrained weights after each epoch, keeping inference complexity unchanged. On OCTMNISTv2, CoLoRA applied to VGG16 and ResNet50 achieves up to 1 percent accuracy and 0.013 AUC improvements over strong baselines (Vision Transformers, state-space, and Kolmogorov-Arnold models) while reducing per-epoch training time by nearly 20 percent. Results indicate that CoLoRA provides a stable and effective alternative to full fine-tuning for medical image classification.
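To make the merging claim concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a depthwise kernel with channel multiplier 1, no bias or nonlinearity between the depthwise and pointwise components, stride 1 without padding, and a kernel layout of [out][in][kh][kw]; all function names are hypothetical. Under those assumptions, a depthwise kernel D followed by a pointwise kernel P acts like a dense kernel update with entries P[o][c] * D[c][i][j], so the update can be added to the frozen kernel without changing the layer's shape.

```python
# Sketch only: depthwise + pointwise kernel update and its merge into a frozen kernel.
# Assumptions (not from the paper): channel multiplier 1, no bias or nonlinearity
# between the two components, stride 1, no padding, kernel layout [out][in][kh][kw].

def full_param_count(c_in, c_out, k):
    """Parameters of a dense k x k convolution kernel."""
    return c_in * c_out * k * k

def separable_param_count(c_in, c_out, k):
    """Parameters of a k x k depthwise kernel plus a 1 x 1 pointwise kernel."""
    return c_in * k * k + c_in * c_out

def conv2d(x, kernel):
    """Cross-correlation of x[c][h][w] with kernel[o][c][i][j]."""
    k = len(kernel[0][0])
    rows, cols = len(x[0]) - k + 1, len(x[0][0]) - k + 1
    return [[[sum(k_o[c][i][j] * x[c][r + i][s + j]
                  for c in range(len(x)) for i in range(k) for j in range(k))
              for s in range(cols)]
             for r in range(rows)]
            for k_o in kernel]

def separable_conv(x, depthwise, pointwise):
    """Depthwise filter per input channel, then a 1 x 1 mix across channels."""
    z = [conv2d([x_c], [[d_c]])[0] for x_c, d_c in zip(x, depthwise)]
    return [[[sum(p_o[c] * z[c][r][s] for c in range(len(z)))
              for s in range(len(z[0][0]))]
             for r in range(len(z[0]))]
            for p_o in pointwise]

def merged_update(depthwise, pointwise):
    """Dense kernel with delta[o][c][i][j] = pointwise[o][c] * depthwise[c][i][j]."""
    return [[[[p_oc * d for d in row] for row in depthwise[c]]
             for c, p_oc in enumerate(p_o)]
            for p_o in pointwise]

def merge_into(weight, delta):
    """Elementwise sum of the frozen kernel and the dense update."""
    return [[[[w + d for w, d in zip(w_row, d_row)]
              for w_row, d_row in zip(w_c, d_c)]
             for w_c, d_c in zip(w_o, d_o)]
            for w_o, d_o in zip(weight, delta)]
```

For a 3 x 3 layer with 512 input and 512 output channels, `full_param_count(512, 512, 3)` gives 2,359,296 parameters, whereas `separable_param_count(512, 512, 3)` gives 266,752. These figures only illustrate the factorization and are not the per-model counts reported in the paper.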
