SFILES 2.0: An extended text-based flowsheet representation

Gabriel Vogel, Lukas Schulze Balhorn, Edwin Hirtreiter, Artur M. Schweidtmann

TL;DR

This work addresses the need for machine-readable, interoperable representations of chemical process flowsheets beyond static images and PDFs. It introduces SFILES 2.0, an extended text-based notation that captures complex connectivity, multi-stream heat exchangers, top/bottom product branches, and P&ID-level control structures, together with standardized unit operation naming. A reversible conversion algorithm (graph invariant computation followed by a depth-first search (DFS) traversal) and an open-source Python implementation enable bidirectional translation between flowsheet graphs and SFILES 2.0 strings, which supports FAIR data publication and scalable database construction. By providing a standardized, machine-actionable representation and the tooling to publish and reuse topology information, the approach supports data analysis and AI-enabled processing of flowsheets across research and industry.

Abstract

SFILES is a text-based notation for chemical process flowsheets. It was originally proposed by d'Anterroches (2006), who was inspired by the text-based SMILES notation for molecules. Compared to flowsheet images, the text-based format has several advantages regarding storage, computational accessibility, and, eventually, data analysis and processing. However, the original SFILES version cannot describe essential flowsheet configurations unambiguously, such as the distinction between top and bottom products. Nor can it describe the control structure required for the safe and reliable operation of chemical processes. In addition, there is no publicly available software for decoding or encoding chemical process topologies to SFILES. We propose SFILES 2.0 with a complete description of the extended notation and naming conventions. Additionally, we provide open-source software for the automated conversion between flowsheet graphs and SFILES 2.0 strings. In this way, we hope to encourage researchers and engineers to publish their flowsheet topologies as SFILES 2.0 strings. The ultimate goal is to set the standards for creating a FAIR database of chemical process flowsheets, which would be of great value for future data analysis and processing.
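
To make the idea of converting a flowsheet graph into a text string more concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation and does not reproduce the full SFILES 2.0 grammar. It assumes a simplified, SMILES-like scheme in which units are written in parentheses, side branches are enclosed in square brackets, and a stream that returns to an already written unit is marked by a number at its source and `<number` at its target. The function name `flowsheet_to_string`, the example unit names, and the alphabetical ordering of branches (a stand-in for ranking by graph invariants) are all assumptions made for illustration.

```python
def flowsheet_to_string(edges, start):
    """Serialize a directed flowsheet graph into a bracketed string via DFS.

    Hypothetical helper for illustration only; simplified notation,
    not the complete SFILES 2.0 rule set.
    """
    # Build an adjacency list; every node gets an entry, even pure sinks.
    successors = {}
    for src, dst in edges:
        successors.setdefault(src, []).append(dst)
        successors.setdefault(dst, [])

    tokens = []      # output tokens in traversal order
    position = {}    # node -> index of its token in `tokens`
    closures = []    # (target node, number) for streams into written units

    def visit(node):
        position[node] = len(tokens)
        tokens.append(f"({node})")
        # Assumption: alphabetical order replaces the invariant-based ranking.
        children = sorted(successors[node])
        for i, child in enumerate(children):
            if child in position:
                # Stream returns to a unit that is already in the string.
                number = len(closures) + 1
                closures.append((child, number))
                tokens.append(str(number))
            elif i == len(children) - 1:
                # The last unvisited successor continues the main path.
                visit(child)
            else:
                # Earlier successors are written as bracketed branches.
                tokens.append("[")
                visit(child)
                tokens.append("]")

    visit(start)

    # Mark each target of a returning stream with "<number".
    for target, number in closures:
        tokens[position[target]] += f"<{number}"
    return "".join(tokens)


# Example: one side product and one recycle from the splitter to the mixer.
example_edges = [
    ("raw", "mix"), ("mix", "r"), ("r", "flash"),
    ("flash", "prod-1"), ("flash", "splt"),
    ("splt", "mix"), ("splt", "prod-2"),
]
print(flowsheet_to_string(example_edges, "raw"))
# (raw)(mix)<1(r)(flash)[(prod-1)](splt)1(prod-2)
```

The sketch covers only the forward direction (graph to string) and ignores heat exchangers, top/bottom product tags, and control structures, all of which require the extended notation described in the paper.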

Paper Structure

This paper contains 16 sections, 8 figures, and 3 tables.

Figures (8)

  • Figure 1: (a) Simple chemical process flowsheet with branches and one recycle stream. (b) Graph representation of the flowsheet in (a).
  • Figure 2: Flowsheet with complex connectivity characteristics. (a) PFD. (b) Graph representation.
  • Figure 3: Flowsheet graph with modified node structure of heat exchanger and connectivity attributes for distillation column.
  • Figure 4: (a) Absorption column with two inlets and two outlets. (b) Flowsheet graph of (a) with connectivity stream tags.
  • Figure 5: PFD and flowsheet graph of simple control loops. (a) Flow control of material stream. (b) Level control of tank. (c) Level control of tank with control cascade.
  • ...and 3 more figures