Enhancing CNN Classification with Lamarckian Memetic Algorithms and Local Search
Akhilbaran Ghosh, Rama Sai Adithya Kalidindi
TL;DR
This paper addresses the challenge of optimizing CNN weights with gradient-free methods, with the aim of escaping the local minima that arise in networks with many parameters. It proposes a two-stage training approach: a CNN is first pre-trained as a feature extractor, and the weights of the final MLP are then optimized with a Lamarckian memetic algorithm augmented by local search. The study shows that memetic optimization with local refinement achieves competitive accuracy and converges faster than a pure genetic algorithm and, under certain settings, approaches the performance of gradient-based ADAM, particularly when computational complexity is high. It also reports that treating the problem as single-objective optimization converges more quickly than an NSGA-II multi-objective formulation, which has practical implications for robust, efficient CNN weight optimization. The work suggests future extensions to transfer learning, unsupervised feature extraction, and integration with newer architectures such as vision transformers.
Abstract
Effective weight optimization is critical to the performance of deep neural networks (DNNs). Traditional gradient-based methods can become trapped in local minima. This paper explores population-based metaheuristic optimization algorithms for image classification networks. We propose a novel approach that combines a two-stage training technique with population-based optimization algorithms that incorporate local search. Our experiments demonstrate that the proposed method outperforms state-of-the-art gradient-based techniques, such as ADAM, in accuracy and computational efficiency, particularly for networks with high computational complexity and many trainable parameters. The results suggest that our approach offers a robust alternative to traditional methods for weight optimization in convolutional neural networks (CNNs). Future work will explore adaptive mechanisms for parameter tuning and the application of the proposed method to other types of neural networks and to real-time applications.
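To make the second training stage more concrete, the following is a minimal sketch of a Lamarckian memetic loop, not the authors' implementation. It assumes the features have already been produced by a frozen, pre-trained CNN and, for brevity, stands in a single linear softmax layer for the final MLP. The function names (loss, local_search, memetic_optimize), the hill-climbing local search, the uniform crossover, and all hyperparameter values are illustrative assumptions.

```python
import numpy as np

def loss(w, X, y, n_classes):
    """Cross-entropy of a linear softmax head; w is a flat weight vector."""
    W = w.reshape(X.shape[1], n_classes)
    logits = X @ W
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(y)), y].mean()

def local_search(w, X, y, n_classes, rng, steps=5, sigma=0.05):
    """Gradient-free hill climbing: keep a perturbation only if it lowers the loss."""
    best, best_loss = w, loss(w, X, y, n_classes)
    for _ in range(steps):
        cand = best + sigma * rng.standard_normal(best.shape)
        cand_loss = loss(cand, X, y, n_classes)
        if cand_loss < best_loss:
            best, best_loss = cand, cand_loss
    return best, best_loss

def memetic_optimize(X, y, n_classes, pop_size=20, generations=30, seed=0):
    """Assumed setup: X holds fixed features from a frozen CNN, y holds integer labels."""
    rng = np.random.default_rng(seed)
    dim = X.shape[1] * n_classes
    pop = 0.1 * rng.standard_normal((pop_size, dim))
    history = []  # best loss per generation
    for _ in range(generations):
        # Lamarckian step: the refined weights replace the individual itself,
        # so improvements found by local search are inherited by offspring.
        refined = [local_search(ind, X, y, n_classes, rng) for ind in pop]
        pop = np.array([r[0] for r in refined])
        fit = np.array([r[1] for r in refined])
        order = np.argsort(fit)
        pop, fit = pop[order], fit[order]
        history.append(fit[0])
        # Elitism: the better half survives unchanged and acts as the parent pool.
        elite = pop[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(elite):
            a, b = elite[rng.integers(len(elite), size=2)]
            mask = rng.random(dim) < 0.5  # uniform crossover
            children.append(np.where(mask, a, b) + 0.02 * rng.standard_normal(dim))
        pop = np.vstack([elite, np.array(children)])
    final = np.array([loss(ind, X, y, n_classes) for ind in pop])
    return pop[np.argmin(final)], history

if __name__ == "__main__":
    # Synthetic stand-in for CNN features; real use would pass extracted features.
    demo_rng = np.random.default_rng(1)
    X_demo = demo_rng.standard_normal((200, 8))
    y_demo = (X_demo[:, 0] + X_demo[:, 1] > 0).astype(int)
    w_best, hist = memetic_optimize(X_demo, y_demo, n_classes=2)
    print("first/last best loss:", hist[0], hist[-1])
```

In this sketch the Lamarckian character comes from writing the locally refined weights back into the population before selection. A Baldwinian variant would instead keep the original weights and use only the refined fitness for selection.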
