Solvers for the Hermitian and the pseudo-Hermitian Bethe-Salpeter equation in the Yambo code: Implementation and Performance

Petru Milev, Blanca Mellado-Pinto, Muralidhar Nalabothula, Ali Esquembre Kucukalic, Fernando Alvarruiz, Enrique Ramos, Francesco Filippone, Alejandro Molina-Sanchez, Ludger Wirtz, Jose E. Roman, Davide Sangalli

TL;DR

This work addresses solving the Bethe-Salpeter equation (BSE) as a structured eigenproblem by evaluating two solver paradigms, direct diagonalization and iterative methods, implemented in the Yambo code through interfaces to ScaLAPACK and ELPA (direct) and SLEPc (iterative). It exploits the pseudo-Hermitian structure via the $\Omega$ operator to transform the coupling case into efficiently solvable forms, achieving substantial speedups and memory savings. The study provides detailed CPU and GPU performance analyses for matrices up to $N \approx 10^5$, demonstrating that pseudo-Hermitian solvers can render the coupling case nearly as efficient as the resonant case and that library-based solvers can overcome the solver barrier for large BSE matrices. The findings enable large-scale optical property calculations in condensed matter systems, with concrete guidance on when to prefer direct diagonalization, iterative methods, and pseudo-Hermitian-aware implementations. The work also outlines integration strategies with multiple HPC libraries and highlights future prospects for MAGMA/cuSOLVER integrations.
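The pseudo-Hermitian structure mentioned above can be illustrated on a toy matrix. The following is a minimal NumPy sketch, not Yambo's implementation. It assumes the commonly used block form of the coupling BSE matrix, $H = \begin{pmatrix} R & C \\ -C^* & -R^* \end{pmatrix}$ with $R$ Hermitian and $C$ complex symmetric, and takes $\Omega = \mathrm{diag}(I, -I)$. The function names `build_bse_matrix` and `is_pseudo_hermitian` and the random test blocks are hypothetical and serve only as illustration.

```python
import numpy as np


def build_bse_matrix(n, seed=0):
    """Build a random toy BSE-like matrix H (size 2n) and the matrix Omega.

    Assumption: H = [[R, C], [-conj(C), -conj(R)]] with R Hermitian and
    C complex symmetric. The blocks are random, not physical.
    """
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    b = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    # Resonant block: Hermitian, shifted so that Omega @ H is positive definite
    R = (a + a.conj().T) / 2 + 10.0 * n * np.eye(n)
    # Coupling block: complex symmetric and small compared to R
    C = 0.1 * (b + b.T) / 2
    H = np.block([[R, C], [-C.conj(), -R.conj()]])
    Omega = np.diag(np.concatenate([np.ones(n), -np.ones(n)]))
    return H, Omega


def is_pseudo_hermitian(H, Omega, atol=1e-10):
    """Return True if Omega @ H is Hermitian, i.e. H is pseudo-Hermitian w.r.t. Omega."""
    A = Omega @ H
    return np.allclose(A, A.conj().T, atol=atol)


if __name__ == "__main__":
    H, Omega = build_bse_matrix(4)
    print("H Hermitian:        ", np.allclose(H, H.conj().T))
    print("Omega @ H Hermitian:", is_pseudo_hermitian(H, Omega))
    # With Omega @ H positive definite, the spectrum is real and comes in +/- pairs
    print(np.sort(np.linalg.eigvals(H).real))
```

In this sketch $H$ itself is not Hermitian, but $\Omega H$ is, which is the property a structure-aware solver can use instead of falling back to a generic non-Hermitian algorithm.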

Abstract

We analyze the performance of two strategies in solving the structured eigenvalue problem deriving from the Bethe-Salpeter equation (BSE) in condensed matter physics. The BSE matrix is constructed with the Yambo code, and the two strategies are implemented by interfacing Yambo with the ScaLAPACK and ELPA libraries for direct diagonalization, and with the SLEPc library for the iterative approach. We consider both the Hermitian (Tamm-Dancoff approximation) and pseudo-Hermitian forms, addressing dense matrices of three different sizes. A description of the implementation is also provided, with details for the pseudo-Hermitian case. Timing and memory utilization are analyzed on both CPU and GPU clusters. Our results demonstrate that it is now feasible to handle dense BSE matrices of the order of $10^5$.
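The difference between the two strategies can be sketched on a small Hermitian (resonant-like) matrix. The example below is only a serial stand-in under stated assumptions: SciPy's `eigh` (LAPACK) plays the role of the direct diagonalization that the paper performs with ScaLAPACK and ELPA, and SciPy's `eigsh` (ARPACK) plays the role of the iterative approach that the paper performs with SLEPc. The helper names and the random test matrix are hypothetical.

```python
import numpy as np
from scipy.linalg import eigh
from scipy.sparse.linalg import eigsh


def toy_resonant_matrix(n, seed=0):
    """Random complex Hermitian matrix with a well-separated spectrum (illustrative only)."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    noise = 0.01 * (a + a.conj().T) / 2
    return noise + np.diag(np.arange(n, dtype=float))


def lowest_direct(R, k):
    """Direct strategy: compute the full spectrum, then keep the k lowest eigenvalues."""
    w = eigh(R, eigvals_only=True)  # eigenvalues returned in ascending order
    return w[:k]


def lowest_iterative(R, k):
    """Iterative strategy: compute only the k algebraically smallest eigenvalues."""
    w = eigsh(R, k=k, which="SA", return_eigenvectors=False)
    return np.sort(w)


if __name__ == "__main__":
    R = toy_resonant_matrix(120)
    print("direct:   ", lowest_direct(R, 3))
    print("iterative:", lowest_iterative(R, 3))
```

The direct route pays for the whole spectrum regardless of how many eigenpairs are needed, whereas the iterative route only requires matrix-vector products and a few vectors of extra storage, which is why the preferred choice depends on how much of the spectrum is requested.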

Paper Structure

This paper contains 14 sections, 8 equations, 4 figures, 8 tables.

Figures (4)

  • Figure 1: Time complexity (panels (a)–(c)) and memory complexity (panels (d)–(f)) for solving the BSE eigenvalue problem using a single CPU. Results are shown for: (a,d) the Hermitian case, (b,e) the coupling case with a non-Hermitian algorithm, and (c,f) the coupling case with a pseudo-Hermitian algorithm. In all cases, $N$ denotes the size of the resonant block. Simulations were carried out on the ISM cluster.
  • Figure 2: Comparison of time complexity for the studied algorithms, with the x-axis representing the actual matrix size: $N$ for the resonant case and $2N$ for the coupling case.
  • Figure 3: Execution time (panels (a) and (b)), efficiency (panels (c) and (d)), and memory usage (panels (e) and (f)) for different solvers applied to the resonant and coupling cases (pseudo-Hermitian solver) of the BSE eigenvalue problem. CPU case.
  • Figure 4: Execution time (panels (a) and (b)), efficiency (panels (c) and (d)), and memory usage (panels (e) and (f)) for different solvers applied to the resonant and coupling cases (pseudo-Hermitian solver) of the BSE eigenvalue problem. GPU case. Only host memory is reported here.