Dynamic Feature-based Deep Reinforcement Learning for Flow Control of Circular Cylinder with Sparse Surface Pressure Sensing

Qiulei Wang, Lei Yan, Gang Hu, Wenli Chen, Jean Rabault, Bernd R. Noack

TL;DR

Dynamic feature-based DRL (DF-DRL) tackles active flow control (AFC) of a circular cylinder with sparse surface pressure sensing by lifting temporal sensor histories into a dynamic feature space, which lets a soft actor-critic (SAC) policy learn effective control without a dynamic model of the flow. The DF-DRL model reaches a drag coefficient about 25% lower than that of the vanilla DRL model based on direct sensor feedback, and it achieves comparable or better drag reduction with as few as one surface sensor while also mitigating lift fluctuations. In 2D flows, drag reductions reach 32.2% at $Re=500$ and 46.55% at $Re=1000$; in a high-Re 3D case ($Re=10^4$), the DF-DRL single-sensor setup yields about 28.6% drag reduction, with improved wake coherence under jet actuation. Overall, the approach demonstrates that sparse sensing combined with dynamic feature lifting can deliver robust, high-performance AFC across flow regimes, and it supports practical experimental deployment and future MIMO extensions.

Abstract

This study proposes a self-learning algorithm for closed-loop cylinder wake control that targets lower drag and lower lift fluctuations under the additional challenge of sparse sensor information, taking deep reinforcement learning (DRL) as the starting point. DRL performance is significantly improved by lifting the sensor signals to dynamic features (DF), which predict future flow states. The resulting dynamic feature-based DRL (DF-DRL) automatically learns a feedback control law in the plant without a dynamic model. Results show that the drag coefficient of the DF-DRL model is 25% lower than that of the vanilla model based on direct sensor feedback. More importantly, using only one surface pressure sensor, DF-DRL can reduce the drag coefficient by about 8% at Re = 100, a state-of-the-art performance, and significantly mitigate lift coefficient fluctuations. Hence, DF-DRL allows the deployment of sparse sensing of the flow without degrading the control performance. The method is also robust at higher Reynolds numbers, reducing the drag coefficient by 32.2% and 46.55% at Re = 500 and 1000, respectively, which indicates its broad applicability. Since surface pressure is more straightforward to measure in realistic scenarios than flow velocity, this study provides a valuable reference for experimentally designing active flow control of a circular cylinder based on wall pressure signals, an essential step toward developing intelligent control in realistic multi-input multi-output (MIMO) systems.

Paper Structure (19 sections, 16 equations, 18 figures, 4 tables, 1 algorithm)

This paper contains 19 sections, 16 equations, 18 figures, 4 tables, 1 algorithm.

Figures (18)

  • Figure 1: Schematic of the DF-DRL (SAC) framework used in the present study. The term wrapper refers to the process of encapsulating actions from the agent and sending them to the OpenFOAM solver, whereas extractor refers to the process of parsing CFD results and providing feedback to the agent. This framework is derived from the DRLinFluids package (Wang et al., 2022).
  • Figure 2: Flowchart of the two approaches for collecting the state from the environment. (a) Vanilla method (sensor feedback): the agent collects the flow field state at a single time step only. For example, the signals obtained from four pressure sensors located in the flow field at the 40th time step give a state vector $s\in\mathbb{R}^{4\times 1}$; (b) DF-DRL method: the agent collects data from the most recent thirty time steps $t$, including the historical sensor pressure $p\in\mathbb{R}^{30\times 1}$ and the action data $a\in\mathbb{R}^{30\times 1}$ provided by the agent. This process constitutes the dynamic feature lifting and yields a state vector $\boldsymbol{S}\in\mathbb{R}^{30\times 2}$. Moreover, scaling the state vector amplifies signal fluctuations, which helps capture the flow characteristics.
  • Figure 3: Description of (a) the numerical setup, adapted from Schäfer and Turek (1996). The origin of the coordinates is located at the lower left corner of the computational domain. $\Gamma_{in}$ denotes the inflow boundary, where a parabolic velocity profile is imposed, and $\Gamma_{out}$ denotes the outflow boundary. The no-slip wall boundary condition $\Gamma_W$ is applied on the bottom and top of the channel and on the cylinder surface. Two jet orifices ($\Gamma_1$ and $\Gamma_2$) are located on the two sides of the cylinder; (b) an enlarged view of the dashed box in subfigure (a). The $0^\circ$ azimuth angle corresponds to the foremost point on the cylinder's windward surface, and the angle increases clockwise. The jet actuator opening angle is $10^\circ$, consistent with Rabault et al. (2019).
  • Figure 4: Number and configuration of the sensors used to generate the DRL controller state observation: (a) layout using 147 sensors, which provides sufficient flow information for DRL training, denoted by $\mathcal{L}_{I}$; (b) layouts using 4, 8, 12, 24, and 36 sensors on the surface of the cylinder, denoted by $\mathcal{L}_{II}^{4}$, $\mathcal{L}_{II}^{8}$, $\mathcal{L}_{II}^{12}$, $\mathcal{L}_{II}^{24}$, and $\mathcal{L}_{II}^{36}$, respectively; (c) layouts using only one sensor located on the surface of the cylinder at an azimuth angle of $0^{\circ}$, $30^{\circ}$, $60^{\circ}$, $90^{\circ}$, or $\theta^{\circ}$, denoted by $\mathcal{L}_{III}^{0}$, $\mathcal{L}_{III}^{30}$, $\mathcal{L}_{III}^{60}$, $\mathcal{L}_{III}^{90}$, and $\mathcal{L}_{III}^{\theta}$, respectively.
  • Figure 5: Comparison of (a) the mean $C_D$, (b) the reward, and (c) the standard deviation of $C_L$ for different DRL methods (with or without DF-DRL) and different numbers of sensors (4 sensors on the surface of the cylinder and 147 sensors around the cylinder). The case with 147 sensors and no time series reaches the maximum reward. Each case is trained three times; the thick line shows the average over the three runs and the shaded area shows the standard deviation between them.
  • ...and 13 more figures
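
The two state definitions in the Figure 2 caption can be illustrated with a minimal sketch. It is not the authors' implementation: the names `vanilla_state` and `DynamicFeatureBuffer` are hypothetical, the zero-padding of the window before thirty samples are available is an assumption, and the multiplicative `scale` factor is only one possible reading of the scaling step, for which the caption gives no formula.

```python
from collections import deque

HISTORY = 30  # window length taken from the Figure 2(b) caption


def vanilla_state(pressures):
    """Sensor-feedback state: one column of instantaneous pressures, shape (n, 1)."""
    return [[float(p)] for p in pressures]


class DynamicFeatureBuffer:
    """Rolling window of (pressure, action) pairs for one sensor (hypothetical helper)."""

    def __init__(self, history=HISTORY, scale=1.0):
        # Assumed multiplicative scaling; the caption does not specify the form.
        self.scale = scale
        # Assumption: the window is zero-padded until real samples fill it.
        self.rows = deque([(0.0, 0.0)] * history, maxlen=history)

    def push(self, pressure, action):
        """Append the newest sample; the oldest one is dropped automatically."""
        self.rows.append((float(pressure), float(action)))

    def state(self):
        """Return the lifted state as `history` rows of [pressure, action]."""
        return [[self.scale * p, self.scale * a] for p, a in self.rows]
```

After each control step the caller would `push` the latest pressure reading and the action just applied, then pass `state()` (30 rows by 2 columns for the default window) to the agent; converting the nested list to an array or tensor is left to the surrounding training code.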