Optimizing Downlink C-NOMA Transmission with Movable Antennas: A DDPG-based Approach
Ali Amhaz, Mohamed Elhattab, Chadi Assi, Sanaa Sharafeddine
TL;DR
This work investigates a downlink cooperative NOMA (C-NOMA) system in which a multi-antenna base station (BS) serves two users equipped with movable antennas (MAs), enabling flexible beamforming and relaying. The goal is to maximize the sum rate by jointly optimizing the BS beamformers, the transmit power at the near user, and the MA positions, subject to QoS constraints and random channels; the resulting problem is non-convex. A deep deterministic policy gradient (DDPG) reinforcement-learning framework with Actor-Critic networks handles the continuous state and action spaces, using a reward structure that penalizes constraint violations. Results show gains of up to 45% over the MA-NOMA benchmark and up to 60% over the fixed-antenna C-NOMA benchmark, and the RL solution reaches about 93% accuracy relative to the optimal solution, highlighting the efficacy of combining MAs with C-NOMA and RL for next-generation networks.
Abstract
This paper analyzes a downlink cooperative non-orthogonal multiple access (C-NOMA) scenario in which a base station (BS) serves a pair of users equipped with movable antenna (MA) technology. The user with the better channel to the BS relays the signal to the other user, providing an additional transmission resource and enhancing performance. Each user has one receiving MA, and the relaying user additionally has one transmitting MA. We formulate an optimization problem that maximizes the achievable sum rate by jointly determining the beamforming vector at the BS, the transmit power at the relaying user, and the positions of the MAs, while meeting the quality of service (QoS) constraints. Because the formulated problem is non-convex and the channels are random, we adopt a deep deterministic policy gradient (DDPG) approach, a reinforcement learning (RL) algorithm capable of handling continuous state and action spaces. Numerical results demonstrate the superiority of the presented model over the benchmark schemes, with gains of up to 45% over the NOMA-enabled MA scheme and up to 60% over the C-NOMA model with fixed antennas. The solution approach achieves 93% accuracy relative to the optimal solution.
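
To make the idea of a sum-rate reward with constraint penalties concrete, the following is a minimal sketch of one common way such a reward can be shaped. It is not the reward defined in the paper: the function names, the linear penalty form, and the `penalty_weight` value are all assumptions chosen for illustration.

```python
import math


def achievable_rate(sinr: float) -> float:
    """Shannon rate in bit/s/Hz for a linear (not dB) SINR value."""
    return math.log2(1.0 + sinr)


def penalized_reward(sinrs, min_rates, penalty_weight=10.0):
    """Sum rate minus a penalty proportional to the total QoS shortfall.

    sinrs          : per-user linear SINR values (hypothetical inputs)
    min_rates      : per-user minimum rate requirements in bit/s/Hz
    penalty_weight : assumed tuning constant, not taken from the paper
    """
    rates = [achievable_rate(s) for s in sinrs]
    sum_rate = sum(rates)
    # Each user contributes to the penalty only if its rate falls below its QoS target.
    shortfall = sum(max(0.0, r_min - r) for r, r_min in zip(rates, min_rates))
    return sum_rate - penalty_weight * shortfall


if __name__ == "__main__":
    # Two users with illustrative SINRs; the second call violates user 2's target.
    print(penalized_reward([3.0, 1.0], [1.0, 1.0]))
    print(penalized_reward([3.0, 1.0], [1.0, 2.0]))
```

In this sketch an agent receives the plain sum rate whenever every QoS target is met, and the reward drops in proportion to the shortfall otherwise, which steers the policy back toward feasible beamformers, power levels, and antenna positions.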
