Analyzing Configuration Dependencies of File Systems

Tabassum Mahmud, Om Rameshwar Gatla, Duo Zhang, Carson Love, Ryan Bumann, Varun S Girimaji, Mai Zheng

TL;DR

The paper tackles configuration-related complexity in file systems. It studies configuration-related issues of Ext4 and XFS in depth, identifies a prevalent pattern called multilevel configuration dependencies, and builds ConfD, an extensible framework with a core for dependency extraction and a plugin suite for issue detection. ConfD uses metadata-assisted taint analysis on LLVM IR to connect parameters across components via shared FS metadata, organizes dependencies into a taxonomy of Self Dependency, Cross-Parameter Dependency, and Cross-Component Dependency, and generates dependency-guided configuration states for targeted testing and regression analysis. In experiments on Ext4, XFS, and ZFS, ConfD extracts 160 configuration dependencies with a low false-positive rate, and the dependency-guided plugins uncover a range of issues (specification gaps, misconfigurations, and regression failures) while substantially reducing false positives in regression testing. The work demonstrates the practical value of dependency-aware configuration analysis for file systems and suggests extensions to other storage systems (e.g., WiredTiger) and databases, with potential integration into CI pipelines and broader language support through LLVM.

Abstract

File systems play an essential role in modern society for managing precious data. To meet diverse needs, they often support many configuration parameters. Such flexibility comes at the price of additional complexity which can lead to subtle configuration-related issues. To address this challenge, we study the configuration-related issues of two major file systems (i.e., Ext4 and XFS) in depth, and identify a prevalent pattern called multilevel configuration dependencies. Based on the study, we build an extensible tool called ConfD to extract the dependencies automatically, and create a set of plugins to address different configuration-related issues. Our experiments on Ext4, XFS and a modern copy-on-write file system (i.e., ZFS) show that ConfD was able to extract 160 configuration dependencies for the file systems with a low false positive rate. Moreover, the dependency-guided plugins can identify various configuration issues (e.g., mishandling of configurations, regression test failures induced by valid configurations). In addition, we also explore the applicability of ConfD on a popular storage engine (i.e., WiredTiger). We hope that this comprehensive analysis of configuration dependencies of storage systems can shed light on addressing configuration-related challenges for the system community in general.

Paper Structure

This paper contains 30 sections, 3 figures, 17 tables.

Figures (3)

  • Figure 1: A Configuration-Related Issue of Ext4. When the sparse_super2 feature is enabled and the size parameter of resize2fs is larger than the current Ext4 size, expanding the file system results in metadata corruption.
  • Figure 2: Methods of Configuring File Systems. This figure shows four typical stages to configure a file system: (a) at creation (e.g., mke2fs); (b) at mount time (mount) before usage; (c) via online utilities (e.g., e4defrag); (d) via offline utilities.
  • Figure 3: Overview of ConfD. There are two parts: (1) ConfD-core (yellow) for extracting configuration dependencies and generating critical states; (2) ConfD-plugins (green) for detecting various configuration-related issues.
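
To make the taxonomy and the Figure 1 scenario concrete, the following is a minimal Python sketch, not ConfD's implementation. The `Param` type, the `classify` rule, and `hits_figure1_state` are hypothetical names introduced here for illustration. The classification rule is a deliberate simplification (same parameter, same utility, or different utilities), and attributing sparse_super2 to mke2fs is an assumption about where the feature is set.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Set


class DepKind(Enum):
    SELF = "Self Dependency"
    CROSS_PARAMETER = "Cross-Parameter Dependency"
    CROSS_COMPONENT = "Cross-Component Dependency"


@dataclass(frozen=True)
class Param:
    """A configuration parameter (hypothetical model, not ConfD's)."""
    component: str  # utility that accepts the parameter, e.g. "mke2fs"
    name: str


def classify(a: Param, b: Param) -> DepKind:
    """Simplified rule: compare which parameter and utility are involved."""
    if a == b:
        return DepKind.SELF
    if a.component == b.component:
        return DepKind.CROSS_PARAMETER
    return DepKind.CROSS_COMPONENT


def hits_figure1_state(features: Set[str], fs_size_blocks: int,
                       resize_target_blocks: int) -> bool:
    """True when a configuration state matches the Figure 1 condition:
    sparse_super2 is enabled and the resize target exceeds the current size."""
    return "sparse_super2" in features and resize_target_blocks > fs_size_blocks


if __name__ == "__main__":
    feature = Param("mke2fs", "sparse_super2")  # assumed to be set at creation
    size = Param("resize2fs", "size")
    print(classify(feature, size).value)
    print(hits_figure1_state({"sparse_super2"}, 1000, 2000))
```

A test generator guided by such dependencies would prioritize states where `hits_figure1_state` holds, rather than sampling parameter combinations uniformly.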