Logi-PAR: Logic-Infused Patient Activity Recognition via Differentiable Rule

Muhammad Zarar, MingZheng Zhang, Xiaowang Zhang, Zhiyong Feng, Sofonias Yitagesu, Kawsar Farooq

TL;DR

Logi-PAR is the first framework to recognize patient activity by applying learnable logic rules to symbolic mappings, and demonstrates state-of-the-art performance, significantly outperforming Vision-Language Models and transformer baselines.

Abstract

Patient Activity Recognition (PAR) in clinical settings uses activity data to improve safety and quality of care. Although significant progress has been made, current models mainly identify which activity is occurring. They often spatially compose sub-sparse visual cues using global and local attention mechanisms, yet learn only logically implicit patterns because of their purely neural pipelines. Advancing clinical safety requires methods that can infer why a set of visual cues implies a risk, and how these cues can be reasoned about compositionally through explicit logic, beyond mere classification. To address this, we propose Logi-PAR, the first Logic-Infused Patient Activity Recognition framework, which integrates contextual fact fusion as a multi-view primitive extractor and injects neural-guided differentiable rules. Our method automatically learns rules from visual cues, optimizing them end-to-end while enabling implicitly emerging patterns to be explicitly labelled during training. To the best of our knowledge, Logi-PAR is the first framework to recognize patient activity by applying learnable logic rules to symbolic mappings. It produces auditable "why" explanations as rule traces and supports counterfactual interventions (e.g., risk would decrease by 65% if assistance were present). Extensive evaluation on clinical benchmarks (VAST and OmniFall) demonstrates state-of-the-art performance, significantly outperforming Vision-Language Models and transformer baselines. The code is available at https://github.com/zararkhan985/Logi-PAR.git
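The abstract's idea of rule traces and counterfactual interventions can be illustrated with a minimal sketch. This is not the authors' implementation: the fact names, the product t-norm used for the soft AND, and the single hand-written rule are all assumptions chosen for illustration, and the numbers are invented inputs rather than results from the paper.

```python
from math import prod

def soft_not(p: float) -> float:
    """Soft negation of a fact probability."""
    return 1.0 - p

def soft_and(*probs: float) -> float:
    """Soft conjunction via the product t-norm (an assumed choice)."""
    return prod(probs)

def unattended_exit_risk(facts: dict[str, float]) -> float:
    """Hypothetical rule: at_bed_edge AND rail_down AND NOT assistance_present."""
    return soft_and(
        facts["patient_at_bed_edge"],
        facts["bed_rail_down"],
        soft_not(facts["assistance_present"]),
    )

def counterfactual(facts: dict[str, float], name: str, value: float) -> tuple[float, float]:
    """Return (risk before, risk after) when one fact is overridden."""
    before = unattended_exit_risk(facts)
    after = unattended_exit_risk({**facts, name: value})
    return before, after

# Illustrative fact probabilities (made up, not taken from VAST or OmniFall).
facts = {
    "patient_at_bed_edge": 0.9,
    "bed_rail_down": 0.8,
    "assistance_present": 0.1,
}
before, after = counterfactual(facts, "assistance_present", 1.0)
print(f"risk before: {before:.3f}, risk if assistance were present: {after:.3f}")
```

Because every operation is a product or a subtraction, the rule output is differentiable with respect to each fact probability, which is what allows rules of this kind to be trained end-to-end together with a perception backbone.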


Paper Structure

This paper contains 21 sections, 13 equations, 4 figures, and 2 tables.

Figures (4)

  • Figure 1: From Black-Box Correlations to Logical Inference in Clinical Risk Detection. Standard attention mechanisms frequently miss decisive but sparse risk cues, mistaking high-risk exits for resting states due to background bias. Logi-PAR overcomes this by enforcing a hierarchical inference structure: it first extracts reliable Atomic Facts via fine-grained attention and then applies learnable logic rules. This explicitly models the causal structure of an Unattended Bed Exit, ensuring correct classification and generating human-verifiable explanations.
  • Figure 2: Overview of the proposed Logi-PAR framework. The framework processes multi-view images through a shared perception backbone to extract a Probabilistic Fact Graph. These facts are then fed into the Differentiable Causal-Logic Layer, where the Gated Soft-Logic Composition (Eq. 2) dynamically assembles atomic facts into complex clinical risk states, enabling both accurate classification and explanations.
  • Figure 3: Impact of differentiable logic rules during training. The blue line shows that Logi-PAR maintains high accuracy, while the red line shows how sparsity regularization ($\lambda_2$) forces the model to "prune" unnecessary logic gates, drastically reducing the number of active rules.
  • Figure 4: Visualization of practical deployment on a VAST sample (Video ID: P04_Exit_03) in 3-Step PAR. Heatmaps compare the baseline global attention (top), which erroneously fixates on the pillow, against Logi-PAR's fact-specific attention (bottom). The backbone visualization confirms that Logi-PAR distributes attention across views to resolve occlusion, thereby providing the logic module ($\psi$) with a complete set of atomic facts for reliable PAR inference.
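
The captions for Figures 2 and 3 mention a gated soft-logic composition and a sparsity term weighted by $\lambda_2$ that prunes logic gates. The paper's Eq. 2 is not reproduced on this page, so the following is only a sketch of one common way to build such a layer: each atomic fact gets a learnable gate, a closed gate removes the fact from the conjunction, and an L1-style penalty on the gates encourages pruning. The function names, the gate formula, and the 0.5 threshold are assumptions, not the authors' definitions.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def gated_soft_and(probs: list[float], gate_logits: list[float]) -> float:
    """Gated conjunction: gate near 1 keeps the fact, gate near 0 ignores it."""
    out = 1.0
    for p, w in zip(probs, gate_logits):
        g = sigmoid(w)
        # g = 1 gives factor p; g = 0 gives factor 1 (fact has no influence).
        out *= 1.0 - g * (1.0 - p)
    return out

def sparsity_penalty(gate_logits: list[float], lambda_2: float) -> float:
    """L1-style penalty on gate activations, added to the training loss."""
    return lambda_2 * sum(sigmoid(w) for w in gate_logits)

def active_gates(gate_logits: list[float], threshold: float = 0.5) -> list[int]:
    """Indices of facts whose gate is still open after training."""
    return [i for i, w in enumerate(gate_logits) if sigmoid(w) > threshold]

# Illustrative values: the second fact's gate has been driven towards zero.
fact_probs = [0.9, 0.2, 0.8]
gate_logits = [6.0, -6.0, 6.0]

rule_value = gated_soft_and(fact_probs, gate_logits)
penalty = sparsity_penalty(gate_logits, lambda_2=0.01)
print("rule value:", round(rule_value, 3))
print("penalty:", round(penalty, 4))
print("active facts:", active_gates(gate_logits))
```

Under this formulation, raising $\lambda_2$ pushes more gates towards zero, which matches the caption's description of a falling count of active rules, while the rule value stays dominated by the facts whose gates remain open.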