Cost-informed dimensionality reduction for structural digital twin technologies

Aidan J. Hughes; Keith Worden; Nikolaos Dervilis; Timothy J. Rogers

TL;DR

A decision-theoretic approach to dimensionality reduction for structural asset management, constructed as an eigenvalue problem, with separabilities between classes weighted according to the cost of misclassifying them when considered in the context of a decision process.

Abstract

Classification models are a key component of structural digital twin technologies used for supporting asset management decision-making. An important consideration when developing classification models is the dimensionality of the input, or feature, space used. If the dimensionality is too high, then the "curse of dimensionality" may rear its ugly head, manifesting as reduced predictive performance. To mitigate such effects, practitioners can employ dimensionality reduction techniques. The current paper formulates a decision-theoretic approach to dimensionality reduction for structural asset management. In this approach, the aim is to keep incurred misclassification costs to a minimum as the dimensionality is reduced and discriminatory information may be lost. The formulation is constructed as an eigenvalue problem, with separabilities between classes weighted according to the cost of misclassifying them when considered in the context of a decision process. The approach is demonstrated using a synthetic case study.
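
The abstract does not state the eigenvalue problem itself, so the following is only a minimal sketch of one way such a cost-weighted formulation could look, not the authors' method. It assumes an LDA-style generalised eigenproblem in which each pairwise between-class scatter term is scaled by a symmetrised misclassification cost. The function name `cost_weighted_projection`, the cost matrix layout (rows and columns in sorted class order), the prior weighting, and the small ridge term are all assumptions made for illustration.

```python
import numpy as np


def cost_weighted_projection(X, y, cost, n_components):
    """Illustrative cost-weighted discriminant projection.

    Assumption: `cost[i, j]` is the cost of confusing the i-th and j-th
    classes, indexed in the order returned by np.unique(y). This is a
    sketch, not the formulation used in the paper.
    """
    classes = np.unique(y)
    n, d = X.shape
    means = [X[y == c].mean(axis=0) for c in classes]
    priors = [np.mean(y == c) for c in classes]

    # Within-class scatter, as in standard LDA.
    Sw = np.zeros((d, d))
    for c, mu in zip(classes, means):
        Xc = X[y == c] - mu
        Sw += Xc.T @ Xc / n
    Sw += 1e-8 * np.eye(d)  # small ridge (assumed) to keep Sw positive definite

    # Pairwise between-class scatter, each pair weighted by its
    # symmetrised misclassification cost.
    Sb = np.zeros((d, d))
    for i in range(len(classes)):
        for j in range(i + 1, len(classes)):
            w = 0.5 * (cost[i, j] + cost[j, i])
            diff = (means[i] - means[j])[:, None]
            Sb += priors[i] * priors[j] * w * (diff @ diff.T)

    # Solve Sb v = lambda Sw v by whitening with the Cholesky factor of Sw.
    L = np.linalg.cholesky(Sw)
    L_inv = np.linalg.inv(L)
    evals, U = np.linalg.eigh(L_inv @ Sb @ L_inv.T)

    # Keep the eigenvectors with the largest eigenvalues.
    order = np.argsort(evals)[::-1][:n_components]
    W = L_inv.T @ U[:, order]
    return W, evals[order]
```

Under these assumptions, projecting with `X @ W` retains the directions along which the costliest class confusions are best separated; setting every off-diagonal cost to the same value gives a pairwise form of ordinary LDA.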

Paper Structure (13 sections, 14 equations, 4 figures, 1 table)

Introduction
Background Theory
Statistical Classification and Decision-making
Dimensionality Reduction
Principal Component Analysis
Linear Discriminant Analysis
Cost-informed Dimensionality Reduction
Cost Function
Formulation
A Visual Case Study
Results
Discussion
Conclusions

Figures (4)

Figure 1: Synthetic dataset in its original three dimensions.
Figure 2: Possible two-dimensional projections of the dataset retaining the original features $x_1$, $x_2$, and $x_3$. The combinations shown are as follows (a) $\{x_1,x_2\}$, (b) $\{x_1,x_3\}$, and (c) $\{x_2,x_3\}$.
Figure 3: Two-dimensional projections of the dataset, transformed using the first two eigenvectors obtained via (a) cost-informed dimensionality reduction, (b) LDA, and (c) PCA.
Figure 4: Box plots representing the distributions of total misclassification costs for models trained on data projected using PCA, LDA, and cost-informed LDA, as the number of dimensions is reduced.
