Multi-Objective Deep Reinforcement Learning for Optimisation in Autonomous Systems

Juan C. Rosero, Ivana Dusparic, Nicolás Cardozo

TL;DR

This work uses a MORL technique called Deep W-Learning (DWN) and applies it to the Emergent Web Servers exemplar, a self-adaptive server, to find the optimal configuration for runtime performance optimization, and compares DWN to two single-objective optimization implementations.

Abstract

Reinforcement Learning (RL) is used extensively in Autonomous Systems (AS) as it enables learning at runtime without the need for a model of the environment or predefined actions. However, most applications of RL in AS, such as those based on Q-learning, can only optimize one objective, so multi-objective systems must combine their objectives into a single objective function with predefined weights. A number of Multi-Objective Reinforcement Learning (MORL) techniques exist, but they have mostly been applied to RL benchmarks rather than to real-world AS. In this work, we use a MORL technique called Deep W-Learning (DWN) and apply it to the Emergent Web Servers exemplar, a self-adaptive server, to find the optimal configuration for runtime performance optimization. We compare DWN to two single-objective optimization implementations: an ε-greedy algorithm and Deep Q-Networks (DQN). Our initial evaluation shows that DWN optimizes multiple objectives simultaneously, achieves results similar to those of the DQN and ε-greedy approaches while performing better on some metrics, and avoids the issues associated with combining multiple objectives into a single utility function.
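To make the contrast in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It compares a weighted-sum policy (objectives collapsed into one utility with predefined weights) with W-learning-style arbitration, in which each objective keeps its own Q-values and the objective with the highest W-value in the current state selects its greedy action. All function names, the objective labels, and the numeric values are hypothetical, and plain lists stand in for the Q- and W-networks that DWN would use.

```python
from typing import Dict, List


def weighted_sum_action(q_values: Dict[str, List[float]],
                        weights: Dict[str, float]) -> int:
    """Single-utility baseline: merge per-objective Q-values with fixed weights,
    then pick the action with the highest combined value."""
    n_actions = len(next(iter(q_values.values())))
    combined = [
        sum(weights[obj] * q_values[obj][a] for obj in q_values)
        for a in range(n_actions)
    ]
    return max(range(n_actions), key=combined.__getitem__)


def w_learning_action(q_values: Dict[str, List[float]],
                      w_values: Dict[str, float]) -> int:
    """W-learning-style arbitration: the objective with the highest W-value
    for the current state wins and executes its own greedy action."""
    winner = max(w_values, key=w_values.get)
    q = q_values[winner]
    return max(range(len(q)), key=q.__getitem__)


# Hypothetical Q-values for three server configurations (actions 0..2).
q = {
    "response_time": [0.2, 0.9, 0.4],
    "cost": [0.8, 0.1, 0.5],
}

# The weighted-sum choice depends on weights fixed in advance.
print(weighted_sum_action(q, {"response_time": 0.7, "cost": 0.3}))
print(weighted_sum_action(q, {"response_time": 0.3, "cost": 0.7}))

# With W-values, no weights are needed: the winning objective decides.
print(w_learning_action(q, {"response_time": 0.3, "cost": 0.6}))
```

In the weighted-sum variant the selected configuration changes whenever the weights change, which is the tuning burden the abstract refers to. In the arbitration variant the W-values are assumed to be learned per state, so no weights have to be chosen up front.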

Paper Structure

This paper contains 10 sections, 7 equations, 5 figures, and 3 tables.

Figures (5)

  • Figure 1: Working of a DWN network
  • Figure 2: Comparison of the evolution of avg. response time
  • Figure 3: Comparison of the evolution of cost
  • Figure 4: Behavior of DWN in avg. response time
  • Figure 5: Behavior of DWN in cost