Improving cosmological reach of a gravitational wave observatory using Deep Loop Shaping

Jonas Buchli, Brendan Tracey, Tomislav Andric, Christopher Wipf, Yu Him Justin Chiu, Matthias Lochbrunner, Craig Donner, Rana X. Adhikari, Jan Harms, Iain Barr, Roland Hafner, Andrea Huber, Abbas Abdolmaleki, Charlie Beattie, Joseph Betzwieser, Serkan Cabi, Jonas Degrave, Yuzhu Dong, Leslie Fritz, Anchal Gupta, Oliver Groth, Sandy Huang, Tamara Norman, Hannah Openshaw, Jameson Rollins, Greg Thornton, George Van Den Driessche, Markus Wulfmeier, Pushmeet Kohli, Martin Riedmiller, LIGO Instrument Team

TL;DR

This work tackles the limited low-frequency sensitivity of gravitational-wave (GW) detectors by addressing control-noise injection in the LIGO angular suspension system. It introduces Deep Loop Shaping (DLS), a reinforcement-learning framework that optimizes frequency-domain rewards to shape closed-loop behavior, and demonstrates its application to the challenging common-hard-pitch loop via a distributed actor-critic setup. The learned policies, run on the LIGO Livingston Observatory (LLO), achieve substantial reductions in control noise within the 10–30 Hz GW observation band (up to two orders of magnitude) while preserving control authority at lower frequencies and staying below the quantum back-action limit. The results, including sim2real validation and Caltech 40 m IMC tests, indicate that DLS can meaningfully extend the cosmological reach of current and future GW observatories and has broader applicability to complex, frequency-constrained control problems.

Abstract

Improved low-frequency sensitivity of gravitational wave observatories would unlock the study of intermediate-mass black hole mergers and binary black hole eccentricity, and would provide early warnings for multi-messenger observations of binary neutron star mergers. Today's mirror stabilization control injects harmful noise, constituting a major obstacle to sensitivity improvements. We eliminated this noise through Deep Loop Shaping, a reinforcement learning method using frequency-domain rewards. We proved our methodology on the LIGO Livingston Observatory (LLO). Our controller reduced control noise in the 10–30 Hz band by over 30x, and by up to 100x in sub-bands, surpassing the design goal motivated by the quantum limit. These results highlight the potential of Deep Loop Shaping to improve current and future GW observatories and, more broadly, instrumentation and control systems.

Paper Structure

This paper contains 41 sections, 16 equations, 18 figures, 3 tables.

Figures (18)

  • Figure 1: Cosmological reach and strain noise from control. (A) The plot shows the volume in space explored with binary black hole merger waveforms [khanFrequencydomainGravitationalWaves2016] for different cases of technical noise. The x-axis in (A) is the total mass of the equal-mass binary pair. This corresponds to the x-axis in (B), the frequency of the first quasi-normal mode of a Schwarzschild black hole with such a mass, as measured in the source frame. The purple trace shows the reach of LIGO as of March 2024. The green trace shows the volumetric improvement in the case where the technical noise is removed entirely. Many of the known technical noise sources are linked to controls. (B) LIGO's noise budget and controller performance. Purple: overall measured strain noise; red: strain noise contribution from the currently operational linear controller for $\theta_{CHP}$; blue: strain noise contribution from the RL policy as run on the LIGO Livingston Observatory on Dec 5, 2024 (mean, 10% and 90% percentiles). Dashed green indicates the control design goal derived from the quantum back-action limit by applying a design margin of 10x; the control noise should drop below this curve. A detailed accounting of technical noise sources is available in [O4InstrumentPaper].
  • Figure 2: Deep Loop Shaping: reinforcement learning with frequency-domain rewards. (A) (1) A model is identified from plant measurements. (2) The identified model is used as a learning environment, in which rewards are computed in the frequency domain. (3) The optimized control policy is deployed on the plant. (B) Illustration of the frequency rewards and the multiplicative scoring.
  • Figure S1: Detailed illustration of Deep Loop Shaping. (1) A model is identified from plant measurements (Section \ref{s:modelling}). (2) The identified model is used as a learning environment in a distributed actor-critic setup (Section \ref{s:ML}). We show details of the learning environment, and an illustration of how frequency-domain rewards compute the intermediate reward r(t) in three main steps: apply a filter, score with a sigmoid, multiply for a soft-AND (Section \ref{s:a_rewards}). (3) The optimized control policy is exported for hard-real-time deployment on the plant using code generation.
  • Figure S2: LIGO arm cavity and mirror stages. The diagram shows the dual-recycled Fabry–Perot Michelson interferometer (IFO) layout with the laser beams, the simulated optomechanical system with mirror suspension system, and the integrated alignment sensing and control (ASC) system. ITMX and ITMY represent the input test masses (ITMs) for the x- and y-arms, respectively. Fabry–Perot cavities (FPCs) are established by the ITMs and end test masses (ETMs) in each arm. Additionally, the power-recycling (PRM) and signal-recycling mirrors (SRM), along with the beam splitter (BS) and ITMs, form the power-recycling cavity (PRC) and signal-recycling cavity (SRC), respectively. The Faraday Isolator (FI) extracts the interferometer reflection and sends it to a set of wavefront sensors (WFSs). The common hard pitch ($\theta_{CHP}$) motion mode is shown; it is one of the geometric modes of the mirrors that form the basis of the control modes. Gravitational wave observatories measure changes in the gravitational field by detecting changes in the interference of a laser beam split into two orthogonal arms of an interferometer. In the top right box, the diagram illustrates the simulated optomechanical system, consisting of the high-power cavity laser beam, the main input noises, the schematic of the quadruple pendulum stage (QUAD) in LIGO, and the associated control system. The QUAD is suspended from the Internal Seismic Isolation (ISI) platform. The ASC signal is fed back to the penultimate mass (PUM) stage. This image is not to scale.
  • Figure S3: (Left) Angle-to-length coupling due to beam-spot miscentering and visualization of the roll-pitch-yaw angles for the test mass (TM). (Right) Illustration of the cavity angle and related geometry used to derive control specifications.
  • ...and 13 more figures
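
The three reward steps named in the Figure S1 caption (apply a filter, score with a sigmoid, multiply for a soft-AND) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the FFT-based ideal band-pass, the band edges, the thresholds, and the sigmoid sharpness are all assumptions chosen for illustration. Only the 10–30 Hz observation band is taken from the text above.

```python
import numpy as np


def band_rms(signal, fs, f_lo, f_hi):
    """Step 1 (filter): RMS of `signal` within [f_lo, f_hi] Hz, using an ideal FFT band-pass."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    mask = (freqs >= f_lo) & (freqs <= f_hi)
    filtered = np.fft.irfft(spectrum * mask, n=len(signal))
    return float(np.sqrt(np.mean(filtered ** 2)))


def sigmoid_score(value, threshold, sharpness=4.0):
    """Step 2 (score): map a band RMS to (0, 1); 0.5 at the threshold, near 1 well below it."""
    x = sharpness * (np.log10(value + 1e-30) - np.log10(threshold))
    return float(1.0 / (1.0 + np.exp(x)))


def frequency_reward(control, error, fs):
    """Step 3 (soft-AND): product of per-band scores, high only if every score is high."""
    # Hypothetical bands and thresholds, chosen only for this illustration.
    quiet_in_band = sigmoid_score(band_rms(control, fs, 10.0, 30.0), threshold=1e-2)
    tracking_ok = sigmoid_score(band_rms(error, fs, 0.0, 3.0), threshold=1e-1)
    return quiet_in_band * tracking_ok


# Synthetic test signals (assumed, not measured data): 4 s sampled at 256 Hz.
fs = 256.0
t = np.arange(0.0, 4.0, 1.0 / fs)
small_error = 1e-3 * np.sin(2 * np.pi * 1.0 * t)     # good low-frequency tracking
large_error = 10.0 * np.sin(2 * np.pi * 1.0 * t)     # poor low-frequency tracking
quiet_control = 1e-4 * np.sin(2 * np.pi * 20.0 * t)  # little control noise at 20 Hz
noisy_control = 1.0 * np.sin(2 * np.pi * 20.0 * t)   # strong control noise at 20 Hz

reward_good = frequency_reward(quiet_control, small_error, fs)
reward_noisy = frequency_reward(noisy_control, small_error, fs)
reward_drift = frequency_reward(quiet_control, large_error, fs)
```

In this sketch only `reward_good` is close to 1. Violating either requirement (quiet in the observation band, or tight tracking at low frequency) drives the product toward 0, which is the sense in which multiplication acts as a soft-AND over the individual scores.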