HEP Statistical Inference for UAV Fault Detection: CLs, LRT, and SBI Applied to Blade Damage

Khushiyant

Abstract

This paper transfers three statistical methods from particle physics to multirotor propeller fault detection: the likelihood ratio test (LRT) for binary detection, the CLs modified frequentist method for false alarm rate control, and sequential neural posterior estimation (SNPE) for quantitative fault characterization. Operating on spectral features tied to rotor harmonic physics, the system returns three outputs: binary detection, controlled false alarm rates, and calibrated posteriors over fault severity and motor location. On UAV-FD, a hexarotor dataset of 18 real flights with 5% and 10% blade damage, leave-one-flight-out cross-validation gives AUC 0.862 +/- 0.007 (95% CI: 0.849--0.876), outperforming CUSUM (0.708 +/- 0.010), autoencoder (0.753 +/- 0.009), and LSTM autoencoder (0.551). At 5% false alarm rate the system detects 93% of significant and 81% of subtle blade damage. On PADRE, a quadrotor platform, AUC reaches 0.986 after refitting only the generative models. SNPE gives a full posterior over fault severity (90% credible interval coverage 92--100%, MAE 0.012), so the output includes uncertainty rather than just a point estimate or fault flag. Per-flight sequential detection achieves 100% fault detection with 94% overall accuracy.
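
The detection and false alarm control steps named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a single scalar spectral feature that is Gaussian under both the healthy and faulty hypotheses, and every name and parameter value (`llr`, `threshold_at_far`, `cls_value`, the means and the standard deviation) is hypothetical. The CLs ratio follows the usual particle-physics convention, CLs = CL_{s+b} / CL_b, with the faulty hypothesis in the role of "signal plus background"; how the paper maps CLs onto false alarm control is not specified in this section.

```python
import math
import random

def gaussian_logpdf(x, mu, sigma):
    """Log density of a normal distribution N(mu, sigma^2) at x."""
    return -0.5 * math.log(2.0 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2.0 * sigma ** 2)

def llr(x, mu_healthy, mu_fault, sigma):
    """Log-likelihood ratio; larger values are more fault-like."""
    return gaussian_logpdf(x, mu_fault, sigma) - gaussian_logpdf(x, mu_healthy, sigma)

def simulate_llr(mu_true, n, rng, mu_healthy=0.0, mu_fault=2.0, sigma=1.0):
    """Draw n toy feature values from N(mu_true, sigma^2) and return their LLRs."""
    return [llr(rng.gauss(mu_true, sigma), mu_healthy, mu_fault, sigma) for _ in range(n)]

def threshold_at_far(q_healthy, far=0.05):
    """Empirical LLR threshold that roughly a fraction `far` of healthy windows exceed."""
    ordered = sorted(q_healthy)
    index = min(int((1.0 - far) * len(ordered)), len(ordered) - 1)
    return ordered[index]

def cls_value(q_obs, q_healthy, q_fault):
    """CLs = CL_{s+b} / CL_b from empirical fractions of calibration LLRs <= q_obs.

    A small CLs disfavours the fault hypothesis for this observation.
    Returns 1.0 (no exclusion) when CL_b is zero, as a conservative fallback.
    """
    cl_sb = sum(q <= q_obs for q in q_fault) / len(q_fault)
    cl_b = sum(q <= q_obs for q in q_healthy) / len(q_healthy)
    return cl_sb / cl_b if cl_b > 0 else 1.0

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the toy experiment is repeatable
    q_h = simulate_llr(0.0, 5000, rng)  # calibration LLRs from healthy windows
    q_f = simulate_llr(2.0, 5000, rng)  # calibration LLRs from faulty windows
    thr = threshold_at_far(q_h, far=0.05)
    detection_rate = sum(q > thr for q in q_f) / len(q_f)
    print("threshold:", thr, "detection rate at 5% false alarms:", detection_rate)
    print("CLs for a healthy-looking window:", cls_value(llr(-1.0, 0.0, 2.0, 1.0), q_h, q_f))
```

Fixing the threshold on healthy calibration data is what ties the detector to a chosen false alarm rate; the detection rate at that threshold then depends only on how well the two hypotheses separate in the feature space.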

Paper Structure (46 sections, 4 equations, 4 figures, 5 tables)

Figures (4)

  • Figure 1: Physical basis for fault detection. (a) Elevated spectral density in the 80--150 Hz band during 10% blade damage confirms the rotor harmonic signal. (b) A ranking by Cohen's $d$ shows that spectral features are 2.5$\times$ more discriminative than time-domain statistics, which motivates the choice of feature space.
  • Figure 2: Detection performance on UAV-FD under LOFO. (a) The composite LRT statistic cleanly separates healthy from faulty windows. (b) LRT dominates all baselines on the ROC curve, with deep learning underperforming due to overfitting on small training sets. (c) In the ablation, each component contributes a measurable AUC improvement.
  • Figure 3: Operational deployment via sequential testing. (a) The raw LRT statistic is highly volatile; exponential moving average (EMA) smoothing prevents transient noise from crossing the detection threshold. (b) A heatmap of the majority vote decisions across all 18 LOFO folds, achieving 100% fault recall with a single conservative false alarm.
  • Figure 4: Simulation-Based Inference (SBI) severity posteriors. Unlike binary classifiers, the neural density estimator produces calibrated uncertainty. The 10% fault posterior is sharply concentrated, whereas the 5% fault posterior accurately reflects the broader statistical ambiguity of subtle damage.
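
The sequential step described in the Figure 3 caption (EMA smoothing of the per-window LRT statistic, followed by a per-flight majority vote) might be sketched as follows. This is an illustration under assumptions: the smoothing constant `alpha`, the `threshold`, and the function names `ema` and `flight_decision` are hypothetical, and the paper's exact voting rule is not given in this section.

```python
def ema(values, alpha=0.1):
    """Exponential moving average, initialised at the first value."""
    smoothed = []
    state = None
    for v in values:
        state = v if state is None else alpha * v + (1.0 - alpha) * state
        smoothed.append(state)
    return smoothed

def flight_decision(llr_windows, threshold, alpha=0.1):
    """Flag a flight as faulty if most smoothed windows exceed the threshold."""
    smoothed = ema(llr_windows, alpha)
    votes = [s > threshold for s in smoothed]
    return sum(votes) > len(votes) / 2

if __name__ == "__main__":
    # A single transient spike crosses the raw threshold but not the smoothed one.
    transient = [0.0, 0.0, 0.0, 8.0, 0.0, 0.0, 0.0, 0.0]
    # A sustained elevation stays above the threshold after smoothing.
    sustained = [3.0] * 10
    print("transient flagged:", flight_decision(transient, threshold=1.0))
    print("sustained flagged:", flight_decision(sustained, threshold=1.0))
```

A smaller `alpha` suppresses more transient noise at the cost of a slower response to a genuine fault, so in practice it would be tuned against the acceptable detection delay.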