Evolutionary Architecture Search through Grammar-Based Sequence Alignment

Adri Gómez Martín, Felix Möller, Steven McDonagh, Monica Abella, Manuel Desco, Elliot J. Crowley, Aaron Klein, Linus Ericsson

TL;DR

This work tackles neural architecture search in expressive grammar-based spaces by introducing two variants of constrained Smith-Waterman (CSWX and RCSWX) to compute edit distances and generate functionally coherent hybrids. By serializing architectures into token sequences and applying dynamic programming, the authors achieve substantial speedups over graph-based distances and enable rigorous diversity and landscape analysis. Empirical results across diverse NAS spaces show competitive performance and reveal how the proposed distance metrics uncover local smoothness and global fragmentation, informing future search strategies. Overall, the method provides scalable, metric-driven crossover and diversity control for complex grammar-based NAS, with potential applicability beyond NAS to other sequence-encoded graph/tree domains.

Abstract

Neural architecture search (NAS) in expressive search spaces is a computationally hard problem, but it also holds the potential to automatically discover completely novel and performant architectures. To achieve this, we need effective search algorithms that can identify powerful components and reuse them in new candidate architectures. In this paper, we introduce two adapted variants of the Smith-Waterman algorithm for local sequence alignment and use them to compute the edit distance in a grammar-based evolutionary architecture search. These algorithms enable us to efficiently calculate a distance metric for neural architectures and to generate a set of hybrid offspring from two parent models. This facilitates the deployment of crossover-based search heuristics, and allows us to perform a thorough analysis of the architectural loss landscape and to track population diversity during search. We highlight how our method vastly reduces computational complexity compared with previous work and enables us to efficiently compute shortest paths between architectures. When instantiating the crossover in evolutionary searches, we achieve competitive results, outperforming competing methods. Future work can build upon this new tool, discovering novel components that can be used more broadly across neural architecture design, and broadening its applications beyond NAS.
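
To make the idea of an edit distance over serialized architectures concrete, the following is a minimal sketch of a plain dynamic-programming edit distance with traceback. It is not the authors' CSWX or RCSWX: it assumes unit costs for every operation and ignores grammar constraints, and the function name `edit_distance_with_path` and the example token names are hypothetical. The `dists` and `paths` matrices follow the convention described for Figure 1: moving right deletes a token of the first parent, moving down adds a token from the second parent, and moving diagonally keeps or substitutes a token.

```python
from typing import List, Sequence, Tuple


def edit_distance_with_path(
    a: Sequence[str], b: Sequence[str]
) -> Tuple[int, List[Tuple[str, str]]]:
    """Unit-cost edit distance from token sequence a to b, plus one optimal edit path."""
    n, m = len(a), len(b)
    # Rows index tokens of b (second parent), columns index tokens of a (first parent).
    dists = [[0] * (n + 1) for _ in range(m + 1)]
    paths = [[""] * (n + 1) for _ in range(m + 1)]
    for j in range(1, n + 1):  # first row: delete every token of a
        dists[0][j], paths[0][j] = j, "delete"
    for i in range(1, m + 1):  # first column: insert every token of b
        dists[i][0], paths[i][0] = i, "insert"

    for i in range(1, m + 1):
        for j in range(1, n + 1):
            same = a[j - 1] == b[i - 1]
            candidates = [
                (dists[i - 1][j - 1] + (0 if same else 1), "keep" if same else "substitute"),
                (dists[i][j - 1] + 1, "delete"),  # move right
                (dists[i - 1][j] + 1, "insert"),  # move down
            ]
            dists[i][j], paths[i][j] = min(candidates, key=lambda c: c[0])

    # Trace the chosen operations back from the bottom-right corner.
    ops: List[Tuple[str, str]] = []
    i, j = m, n
    while i > 0 or j > 0:
        op = paths[i][j]
        if op in ("keep", "substitute"):
            label = a[j - 1] if op == "keep" else f"{a[j - 1]}->{b[i - 1]}"
            ops.append((op, label))
            i, j = i - 1, j - 1
        elif op == "delete":
            ops.append((op, a[j - 1]))
            j -= 1
        else:  # insert
            ops.append((op, b[i - 1]))
            i -= 1
    ops.reverse()
    return dists[m][n], ops


# Hypothetical serialized architectures (token names are illustrative only).
parent_1 = ["conv3x3", "relu", "pool", "linear"]
parent_2 = ["conv3x3", "gelu", "linear"]
distance, edit_path = edit_distance_with_path(parent_1, parent_2)
```

Applying only a prefix of the returned edit path to the first parent yields an intermediate sequence between the two parents, which is one simple way to picture hybrid offspring. Whether such an intermediate is a valid architecture depends on constraints that this sketch does not model.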

Paper Structure

This paper contains 37 sections, 8 equations, 13 figures, 4 tables, 8 algorithms.

Figures (13)

  • Figure 1: Example $dists$ and $paths$ matrices overlaid. Lighter coloured cells denote a higher distance from the first parent model, shown on top. Moving towards the right entails deleting a node from the first model, moving downwards represents the addition of a node from the second model, and moving diagonally corresponds to a node substitution. Dashed lines signal the closure of branching and routing nodes. The optimal mutation path is traced back in thicker lines whose brightness reflects the weight of each operation: the brightest lines correspond to a cost of 1 and the darkest to a cost of 0.
  • Figure 2: Search results comparing STX, CSWX and RCSWX-based evolutionary search with mutation-only searches. The plots show the mean across five seeds and the standard error of the mean. To assess the average performance, we normalise the results based on the lowest and highest attained performance within each dataset and combine them into a single plot by averaging (bottom right).
  • Figure 3: Runtime (s, log scale) against node count for RCSWX, CSWX and SEPX. Scatter markers show raw measurements. Solid lines are global fits: power-law for (R)CSWX and exponential for SEPX, with shaded regions showing adaptive error bounds from local log-residual standard deviations. Extrapolated fits suggest increasing runtime with model complexity. The bottom plot shows the distribution of node counts observed in our searches, highlighting the infeasibility of SEPX in this setting.
  • Figure 4: Smoothness and diversity analysis of the architectural search space and search population. (a) and (d) UMAP projections show fragmented global structure but local continuity, with some very low-performing architectures scattered across the biggest clusters. (b) Semivariogram indicates that smoothness in the CIFAR10 space extends to a range of $\sim 25$ edits, beyond which fitness correlation is no longer observable, while (e) semivariogram shows that the distance between two architectures in the Isabella space does not seem to affect how their fitness values correlate. (c) Scatter plots of the diversity of the population across the search iterations, measured by the average distance between all pairs of architectures.
  • Figure 5: Illustration of subtree crossover. Parent architectures $\bm{P_1}$ and $\bm{P_2}$ swap selected subtrees (highlighted), generating two offspring architectures $\bm{C_1}$ and $\bm{C_2}$.
  • ...and 8 more figures
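
The diversity measure named in the Figure 4(c) caption, the average distance between all pairs of architectures in a population, can be sketched as follows. This is an illustration under stated assumptions rather than the authors' implementation: architectures are assumed to be serialized as token lists, a plain unit-cost Levenshtein distance stands in for the paper's CSWX/RCSWX distances, and the names `levenshtein` and `population_diversity` are hypothetical.

```python
from itertools import combinations
from typing import Callable, Sequence

Tokens = Sequence[str]


def levenshtein(a: Tokens, b: Tokens) -> int:
    """Unit-cost edit distance; a stand-in for the paper's alignment-based distance."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        curr = [i]
        for j, y in enumerate(b, 1):
            curr.append(
                min(
                    prev[j - 1] + (x != y),  # keep or substitute
                    prev[j] + 1,             # drop x from a
                    curr[j - 1] + 1,         # add y from b
                )
            )
        prev = curr
    return prev[-1]


def population_diversity(
    population: Sequence[Tokens],
    distance: Callable[[Tokens, Tokens], int] = levenshtein,
) -> float:
    """Mean distance over all unordered pairs; 0.0 if there are fewer than two members."""
    pairs = list(combinations(population, 2))
    if not pairs:
        return 0.0
    return sum(distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical population of serialized architectures (token names are illustrative).
population = [
    ["conv", "relu", "linear"],
    ["conv", "gelu", "linear"],
    ["conv", "linear"],
]
diversity = population_diversity(population)
```

Because the number of pairs grows quadratically with population size, the cost of each distance evaluation dominates when this quantity is tracked at every search iteration.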