Peer-to-Peer Energy Trading in Dairy Farms using Multi-Agent Reinforcement Learning

Mian Ibad Ali Shah, Marcos Eduardo Cruz Victorio, Maeve Duffy, Enda Barrett, Karl Mason

TL;DR

The paper tackles high, tariff-driven energy costs on rural dairy farms by using multi-agent reinforcement learning (MARL) to enable distributed peer-to-peer (P2P) energy trading. It combines PPO and DQN with a price advisor and a double-auction market to learn dairy-farm bidding strategies under realistic load, generation, and battery constraints. Key contributions include a fully MARL-based MAPDES framework, SDR-based price formation, and an ablation analysis showing that the price advisor and the dairy-specific constraints are critical to performance, with DQN delivering the strongest cost reductions and revenue gains while also reducing peak demand. The results demonstrate significant economic and grid-support benefits of P2P trading in dairy communities, with robust cross-country generalization (Ireland to Finland) and clear guidance for scalable, privacy-preserving market design in rural energy systems.

Abstract

The integration of renewable energy resources in rural areas, such as dairy farming communities, enables decentralized energy management through Peer-to-Peer (P2P) energy trading. This research highlights the role of P2P trading in efficient energy distribution and its synergy with advanced optimization techniques. While traditional rule-based methods perform well under stable conditions, they struggle in dynamic environments. To address this, Multi-Agent Reinforcement Learning (MARL), specifically Proximal Policy Optimization (PPO) and Deep Q-Networks (DQN), is combined with community/distributed P2P trading mechanisms. By incorporating auction-based market clearing, a price advisor agent, and load and battery management, the approach achieves significant improvements. Results show that, compared to baseline models, DQN reduces electricity costs by 14.2% in Ireland and 5.16% in Finland, while increasing electricity revenue by 7.24% and 12.73%, respectively. PPO achieves the lowest peak-hour demand, reducing it by 55.5% in Ireland, while DQN reduces peak-hour demand by 50.0% in Ireland and 27.02% in Finland. These improvements are attributed to the combination of the MARL algorithms and P2P energy trading, which together reduce electricity cost and peak-hour demand and increase electricity selling revenue. This study highlights the complementary strengths of DQN, PPO, and P2P trading in achieving efficient, adaptable, and sustainable energy management in rural communities.
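
To make the idea of auction-based market clearing more concrete, the following is a minimal, generic sketch of a double auction in Python. It is not the paper's mechanism: the abstract does not specify the clearing rule, so the midpoint pricing, the `Order` class, the function name `clear_double_auction`, and the EUR/kWh units are all assumptions made for illustration, and the price advisor and SDR-based pricing are not modelled here.

```python
from dataclasses import dataclass


@dataclass
class Order:
    farm: str     # hypothetical farm identifier
    price: float  # offered price per kWh (assumed unit: EUR/kWh)
    qty: float    # energy quantity in kWh


def clear_double_auction(bids, asks):
    """Match the highest bids with the lowest asks.

    Assumption: each matched pair trades at the midpoint of the bid
    and ask prices. Returns a list of (buyer, seller, qty, price).
    """
    bids = sorted(bids, key=lambda o: o.price, reverse=True)
    asks = sorted(asks, key=lambda o: o.price)
    trades = []
    i = j = 0
    bid_left = bids[0].qty if bids else 0.0
    ask_left = asks[0].qty if asks else 0.0

    # Keep matching while the best remaining bid covers the best remaining ask.
    while i < len(bids) and j < len(asks) and bids[i].price >= asks[j].price:
        qty = min(bid_left, ask_left)
        price = (bids[i].price + asks[j].price) / 2
        trades.append((bids[i].farm, asks[j].farm, qty, price))
        bid_left -= qty
        ask_left -= qty
        if bid_left <= 1e-9:  # buyer fully served, move to next bid
            i += 1
            bid_left = bids[i].qty if i < len(bids) else 0.0
        if ask_left <= 1e-9:  # seller sold out, move to next ask
            j += 1
            ask_left = asks[j].qty if j < len(asks) else 0.0
    return trades


if __name__ == "__main__":
    # Illustrative orders only; these numbers are not from the paper.
    bids = [Order("farm_A", 0.20, 5.0), Order("farm_B", 0.15, 3.0)]
    asks = [Order("farm_C", 0.10, 4.0), Order("farm_D", 0.18, 6.0)]
    for buyer, seller, qty, price in clear_double_auction(bids, asks):
        print(f"{buyer} buys {qty:.1f} kWh from {seller} at {price:.3f}")
```

In a setup like the one the abstract describes, any demand or surplus left unmatched after clearing would presumably be settled with the utility grid at the retail tariff or feed-in price; that step is omitted from this sketch.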

Paper Structure

This paper contains 31 sections, 33 equations, 6 figures, 11 tables, and 3 algorithms.

Figures (6)

  • Figure 1: Process flow of MARL MAPDES simulator
  • Figure 2: Flowchart of the simulator
  • Figure 3: Reward Convergence for Dairy Farm Agents over 2.5M episodes
  • Figure 4: Typical daily patterns of battery SOC (PPO & DQN), load, and generation, averaged across all farms over a year
  • Figure 5: Comparative Percentage Results Across Models: Rule-based, RB+DQN, DQN, and PPO
  • ...and 1 more figures