
Causal knowledge engineering: A case study from COVID-19

Steven Mascaro, Yue Wu, Ross Pearson, Owen Woodberry, Jessica Ramsay, Tom Snelling, Ann E. Nicholson

TL;DR

This paper presents Causal Knowledge Engineering (CKE), a structured, iterative method for building a causal knowledge base of CKBNs to support application-specific models in contexts of severe uncertainty such as the COVID-19 pandemic. It integrates expert elicitation, literature review, and data analyses to craft a hierarchical network of standalone, causally coherent CKBNs anchored by a top-level framework. The COVID-19 case study demonstrates rapid creation and refinement of CKBNs for diagnosis, pathophysiology, and complications, highlighting the role of qualitative parameterisation and reusable causal knowledge as a foundation for future prognostic and decision-support models. The work argues that a CKBN-centric knowledge base improves consistency, reusability, and collaboration, and discusses a comprehensive set of elicitation techniques, structural design rules, and validation practices for broader adoption in health and other domains.

Abstract

COVID-19 appeared abruptly in early 2020, requiring a rapid response amid great uncertainty. Good-quality data and knowledge were initially lacking, and many early models had to be developed with causal assumptions and estimations built in to supplement limited data, often with no reliable approach for identifying, validating and documenting these causal assumptions. Our team embarked on a knowledge engineering process to develop a causal knowledge base consisting of several causal BNs for diverse aspects of COVID-19. The unique challenges of the setting led to experiments with the elicitation approach, and what emerged was a knowledge engineering method we call Causal Knowledge Engineering (CKE). CKE provides a structured approach for building a causal knowledge base that can support the development of a variety of application-specific models. Here we describe the CKE method and use our COVID-19 work as a case study to provide a detailed discussion and analysis of the method.

Paper Structure (65 sections, 9 figures, 1 table)


Figures (9)

  • Figure S1: With each iteration, the amount of direct knowledge grows, in turn supporting each new version of the model. Direct knowledge here refers to knowledge (causal and non-causal) directly applicable to the current modelling problem. Transferable knowledge refers to knowledge about other problems or domains that can be applied to the current modelling problem. Inferable knowledge sits in between: it is knowledge that can be inferred from either the direct or transferable knowledge but is not currently a part of either. (As a simple example, if it is known that A causes B and separately that B causes C, it may still not be explicitly known or recognised that A causes C.)
  • Figure S2: The function (external relationships) and structure (internal form) of the causal knowledge base. We define models to be any existing relevant model (BN or otherwise); literature to be published reports, peer-reviewed or not, which may be descriptive summaries or may include the results of analyses and conclusions; data to be raw recorded values that are not summarised or analysed; and experts to hold beliefs that may, to a varying extent, be informed by the other two or may come solely from personal experience and/or extrapolation from related experiences. Inference can be applied in conjunction with any or all knowledge sources, as depicted in Figure S1.
  • Figure S3: Workflows for the causal knowledge base (left) and individual CKBNs (right). The workflow for the causal knowledge base proceeds through 5 major steps, starting with the purpose and scope, proceeding through reviews and expert recruitment, and ending with development of the causal knowledge base itself (the top-level framework and all the constituent CKBNs). The development of an individual CKBN is initially similar, focusing on purpose and scope, and then proceeds through the familiar stages of causal BN development; here we provide additional recommendations on what these stages might contain.
  • Figure S4: Multiple causal BNs for modelling the relationship of Virus enters the NP (Nasopharynx) to Multi-Organ Failure, in addition to the other (major) causal BN (the full Respiratory BN) from Mascaro et al. (2023).
  • Figure S5: A typical CKBN elicitation workflow. (This workflow focuses specifically on expert elicitation of a CKBN; compare to Figure S3, which applies to the development of a CKBN using any causal knowledge source(s).) There are 5 main stages: general CKBN elicitation preparation, design of the elicitation strategy, development of materials for workshops, conducting workshops, and revisions and follow-ups.
  • ...and 4 more figures
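
The simple example in the Figure S1 caption (A causes B, B causes C, so A causing C is inferable) can be illustrated with a minimal sketch. This is not the paper's method: the function name `inferable_links` and the representation of causal knowledge as (cause, effect) pairs are assumptions made for illustration, and the sketch treats "causes" as plain graph reachability, which real causal relationships do not always satisfy.

```python
from collections import defaultdict


def inferable_links(direct_links):
    """Return indirect (cause, effect) pairs implied by, but absent from, direct_links.

    Hypothetical helper: causal knowledge is simplified to a set of
    directed (cause, effect) pairs, and inference to graph reachability.
    """
    children = defaultdict(set)
    for cause, effect in direct_links:
        children[cause].add(effect)

    inferred = set()
    for start in list(children):
        # Depth-first walk over everything reachable from `start`.
        stack = list(children[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(children.get(node, ()))
        # Keep only pairs that are not already stated directly.
        inferred |= {
            (start, node)
            for node in seen
            if node not in children[start] and node != start
        }
    return inferred


# Direct knowledge: A causes B, and B causes C.
direct = {("A", "B"), ("B", "C")}
print(sorted(inferable_links(direct)))  # [('A', 'C')]
```

In this toy setting, the returned pairs correspond to the inferable knowledge in the caption: relationships that follow from what is already recorded but have not yet been stated explicitly.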