The Impact of Move Schemes on Simulated Annealing Performance

Ruichen Xu, Haochun Wang, Yuefan Deng

TL;DR

The paper addresses how move schemes in Simulated Annealing affect performance when the total move variance is fixed. By modeling SA as an MCMC process and focusing on partial-coordinate updates, the authors show that updating a smaller subset of coordinates can maintain reasonable acceptance and accelerate mixing in high dimensions. They derive a cumulant-based theoretical framework and validate it with extensive simulations on Lennard-Jones clusters, the Rosenbrock function, and hyperelliptic-like landscapes, demonstrating practical improvements in convergence and accuracy. The work provides actionable guidelines for designing SA proposals, with potential impact on large-scale optimization tasks in physics, chemistry, and machine learning.

Abstract

Designing an effective move-generation function for Simulated Annealing (SA) in complex models remains a significant challenge. In this work, we present a combination of theoretical analysis and numerical experiments to examine the impact of various move-generation parameters -- such as how many particles are moved and by what distance at each iteration -- under different temperature schedules and system sizes. Our numerical studies, carried out on both the Lennard-Jones problem and an additional benchmark, reveal that moving exactly one randomly chosen particle per iteration offers the most efficient performance. We analyze acceptance rates, exploration properties, and convergence behavior, providing evidence that partial-coordinate updates can outperform full-coordinate moves in certain high-dimensional settings. These findings offer practical guidelines for optimizing SA methods in a broad range of complex optimization tasks.
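
As a rough illustration of the partial-coordinate idea described in the abstract, the following Python sketch runs SA with a proposal that perturbs only $d$ randomly chosen coordinates. It is not the authors' implementation: the names `propose` and `anneal`, the Gaussian step, the geometric cooling schedule, and the choice of per-coordinate standard deviation $\sigma_{\mathrm{total}}/\sqrt{d}$ (one possible way to hold the total move variance fixed as $d$ changes) are all assumptions made for this example.

```python
import math
import random


def propose(x, d, sigma_total, rng):
    """Return a copy of x with d randomly chosen coordinates perturbed.

    Assumption: each moved coordinate gets Gaussian noise with standard
    deviation sigma_total / sqrt(d), so the summed variance of the move
    is sigma_total**2 regardless of d.
    """
    y = list(x)
    step = sigma_total / math.sqrt(d)
    for i in rng.sample(range(len(x)), d):
        y[i] += rng.gauss(0.0, step)
    return y


def anneal(f, x0, d, sigma_total, steps, t0=1.0, t_end=1e-3, seed=0):
    """Minimize f by simulated annealing with partial-coordinate moves.

    Returns (best_x, best_f, acceptance_rate). The geometric cooling
    schedule from t0 to t_end is an assumed, hypothetical choice.
    """
    rng = random.Random(seed)
    x, fx = list(x0), f(x0)
    best_x, best_f = list(x), fx
    accepted = 0
    for k in range(steps):
        # Temperature decays geometrically from t0 to t_end.
        t = t0 * (t_end / t0) ** (k / max(steps - 1, 1))
        y = propose(x, d, sigma_total, rng)
        fy = f(y)
        # Metropolis rule: always accept downhill moves, accept uphill
        # moves with probability exp(-(fy - fx) / t).
        if fy <= fx or rng.random() < math.exp(-(fy - fx) / t):
            x, fx = y, fy
            accepted += 1
            if fx < best_f:
                best_x, best_f = list(x), fx
    return best_x, best_f, accepted / steps
```

Calling `anneal` with `d=1` and with `d=len(x0)` on the same objective and the same `sigma_total` gives a simple way to compare acceptance rates and best values between single-coordinate and full-coordinate moves.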

Paper Structure

This paper contains 11 sections, 54 equations, 5 figures, 3 tables, 2 algorithms.

Figures (5)

  • Figure 1: Violin plots of the relative errors, $(\mathrm{Best} - \mathrm{BestMin}) / \lvert \mathrm{BestMin} \rvert$, for Lennard-Jones systems of sizes $N=6,\,39,\,89$ at $\text{Step}=100{,}000$. Each subplot has a custom width ratio of $4{:}7{:}8$, and the horizontal axis shows $d$ (the number of coordinates moved). Colors (red, green, blue) indicate three inverse-variance settings $\sigma_{\mathrm{total}}^{-2}=200,\,100,\,10$. The vertical range is restricted to $[0,0.6]$, with labels $N=6,\,39,\,89$ placed in the upper region of each subplot. A single figure-level legend at the bottom summarizes the three variance cases.
  • Figure 2: Heatmap (left) and contour plot (right) for the 2D Rosenbrock function $f(x,y) = (1 - x)^2 + 100\,\bigl(y - x^2\bigr)^2$ with logarithmic color scale. The heatmap covers the domain $[-1.5,1.5]\times[-1.5,1.5]$, and a red rectangle indicates the smaller region $[0.5,1.5]\times[0.5,1.5]$ for the contour plot.
  • Figure 3: Comparison across three problem sizes (N=30, 72, and 200). The legend at the top shows different $d$ values for acceptance rate (solid lines) and best value (dashed lines).
  • Figure 4: Heatmap (left) and contour plot (right) for the hyperelliptic-like function $f(x,y) = 2x^2 + y^2$ in 2D. The heatmap spans the domain $[-2,2]\times[-2,2]$, and a red rectangle highlights the smaller region $[-0.2,0.2]\times[-0.2,0.2]$ on which the contour plot is drawn.
  • Figure 5: Comparison across three problem sizes (N=30, 72, and 200). The legend at the top shows different $d$ values for acceptance rate (solid lines) and best value (dashed lines).
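
The formulas quoted in the captions for Figures 1, 2, and 4 can be written out directly. The sketch below is only a transcription of those three expressions into Python; the function names `relative_error`, `rosenbrock_2d`, and `hyperelliptic_2d` are hypothetical, and no higher-dimensional generalization of the benchmarks is assumed.

```python
def relative_error(best, best_min):
    """Relative error (Best - BestMin) / |BestMin| from the Figure 1 caption."""
    if best_min == 0:
        raise ValueError("best_min must be nonzero")
    return (best - best_min) / abs(best_min)


def rosenbrock_2d(x, y):
    """2D Rosenbrock function f(x, y) = (1 - x)^2 + 100 (y - x^2)^2 (Figure 2)."""
    return (1.0 - x) ** 2 + 100.0 * (y - x * x) ** 2


def hyperelliptic_2d(x, y):
    """Hyperelliptic-like function f(x, y) = 2 x^2 + y^2 (Figure 4)."""
    return 2.0 * x * x + y * y


# Both benchmark functions evaluate to zero at their minimizers:
# rosenbrock_2d(1.0, 1.0) is 0.0 and hyperelliptic_2d(0.0, 0.0) is 0.0.
```

Each function takes plain floats, so it can be evaluated on a grid over the domains given in the captions (for example $[-1.5,1.5]\times[-1.5,1.5]$ for the Rosenbrock heatmap) to reproduce the general shape of the plotted landscapes.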