Interference-free Operating System: A 6 Years' Experience in Mitigating Cross-Core Interference in Linux

Zhaomeng Deng, Ziqi Zhang, Ding Li, Yao Guo, Yunfeng Ye, Yuxin Ren, Ning Jia, Xinwei Hu

TL;DR

The paper investigates cross-core interference that originates in the Linux kernel on multi-core real-time systems and documents six years of industrial practice in mitigating it. It identifies fragmentation in existing isolation mechanisms and advocates a unified, partition-aware design with explicit core indicators, isolation-friendly synchronization, and verifiable programming practices. The authors report fixing 34 cross-core bugs and merging numerous patches; with these changes, openEuler achieves up to 8.7x lower worst-case jitter and up to 11.5x better schedulability than vanilla Linux. The work also reports end-to-end gains for real-time frameworks such as cFS and ROS2 (1.6x and 1.64x lower worst-case latency than RT-Linux, respectively) and offers practical guidance to developers, system designers, and researchers for systematically eliminating OS-induced interference in production environments.

Abstract

Real-time operating systems employ spatial and temporal isolation to guarantee predictability and schedulability of real-time systems on multi-core processors. Any unbounded and uncontrolled cross-core performance interference poses a significant threat to system time safety. However, the current Linux kernel has a number of interference issues and represents a primary source of interference. Unfortunately, existing research does not systematically and deeply explore the cross-core performance interference issue within the OS itself. This paper presents our industry practice for mitigating cross-core performance interference in Linux over the past 6 years. We have fixed dozens of interference issues in different Linux subsystems. Compared to the version without our improvements, our enhancements reduce the worst-case jitter by a factor of 8.7, resulting in a maximum 11.5x improvement in system schedulability. For the worst-case latency in the Core Flight System and the Robot Operating System 2, we achieve 1.6x and 1.64x reductions, respectively, over RT-Linux. Based on our development experience, we summarize the lessons we learned and offer suggestions to system developers for systematically eliminating cross-core interference in three areas: task management, resource management, and concurrency management. Most of our modifications have been merged into Linux upstream and released in commercial distributions.

Paper Structure

This paper contains 28 sections, 10 figures, and 5 tables.

Figures (10)

  • Figure 1: Latency distribution of cyclictest on an idle system.
  • Figure 2: System model.
  • Figure 3: Cross-core interference of workqueue.
  • Figure 4: The process of inter-core interference caused by ASID exhaustion and TLB refresh.
  • Figure 5: Interference captured on an idle isolated core.
  • ...and 5 more figures