Deep Heterogeneous Contrastive Hyper-Graph Learning for In-the-Wild Context-Aware Human Activity Recognition

Wen Ge, Guanyi Mou, Emmanuel O. Agu, Kyumin Lee

TL;DR

This paper proposes a Deep Heterogeneous Contrastive Hyper-Graph Learning (DHC-HGL) framework that captures heterogeneous Context-Aware HAR (CA-HAR) hypergraph properties in a message-passing and neighborhood-aggregation fashion.

Abstract

Human Activity Recognition (HAR) is a challenging multi-label classification problem because activities may co-occur and sensor signals corresponding to the same activity may vary across contexts (e.g., different device placements). This paper proposes a Deep Heterogeneous Contrastive Hyper-Graph Learning (DHC-HGL) framework that captures heterogeneous Context-Aware HAR (CA-HAR) hypergraph properties in a message-passing and neighborhood-aggregation fashion. Prior work explored only homogeneous or shallow-node-heterogeneous graphs. DHC-HGL handles heterogeneous CA-HAR data by 1) constructing three different types of sub-hypergraphs, each passed through a different custom HyperGraph Convolution (HGC) layer designed to handle edge-heterogeneity, and 2) adopting a contrastive loss function to ensure node-heterogeneity. In rigorous evaluation on two CA-HAR datasets, DHC-HGL significantly outperformed state-of-the-art baselines by 5.8% to 16.7% on Matthews Correlation Coefficient (MCC) and 3.0% to 8.4% on Macro F1 scores. UMAP visualizations of learned CA-HAR node embeddings are also presented to enhance model explainability.

Paper Structure (19 sections, 19 equations, 11 figures, 5 tables)

Figures (11)

  • Figure 1: Accelerometer signals for the Walking activity across contexts and performers, from the real-world Extrasensory dataset \cite{vaizman2017recognizing}. Comparing Fig. \ref{fig:u1walkbag} and Fig. \ref{fig:u1walkhand}, we observe disparate accelerometer readings for the same activity under different contexts. Comparing Fig. \ref{fig:u1walkhand} and Fig. \ref{fig:h2walkhand} shows that different users may perform the same activity differently, producing distinct sensor readings even under the same contextual factors.
  • Figure 2: Real-world examples mapped into our heterogeneous hypergraph. A CA-HAR task has three types of nodes: users $u$, activities $a$, and sensor contexts $c$. We use red, green, and blue to distinguish these heterogeneous node types. In the first example, $User_1$ ($u_1$) is typing ($a_{Ty}$) with the phone in hand ($c_H$), so a hyperedge connecting the three nodes $\{u_1, a_{Ty}, c_H\}$ represents the situation. The second example shows that activities may co-occur: $User_2$ ($u_2$) is sitting ($a_S$) and talking ($a_{Ta}$) simultaneously with the phone on a table ($c_T$). In this case, a hyperedge connecting the four nodes $\{u_2, (a_{Ta}, a_S), c_T\}$ is needed to represent the situation. The corresponding incidence matrix for the subgraph containing only these seven nodes and two hyperedges is also shown.
  • Figure 3: Prior work (Fig. \ref{fig:ordinarygraph} and Fig. \ref{fig:hypergraph}) constructs different CA-HAR graphs. Our approach (Fig. \ref{fig:dhc-hgl}) considers both node-heterogeneity and edge-heterogeneity. Intuitively, Fig. \ref{fig:ordinarygraph} is a simplified version of Fig. \ref{fig:hypergraph}, while Fig. \ref{fig:dhc-hgl} further generalizes beyond Fig. \ref{fig:ordinarygraph} and Fig. \ref{fig:hypergraph}. Fig. \ref{fig:illustration} presents an illustration with different sub-hypergraphs.
  • Figure 4: The original heterogeneous hypergraph is further transformed into three sub-graphs, one for each type of hyperedge. Within each sub-graph, node-heterogeneity is still represented using different colors, but only one type of hyperedge is included. The sub-graphs therefore capture graph information at a finer granularity.
  • Figure 5: Overview of our DHC-HGL Framework. It consists of two key components: the graph learning module for label encoding and the classification module for signal encoding. Given the heterogeneous hypergraph formulated (each node is defined as the user, context, and activity tuple), the model updates node representations using a GNN while factoring in node-heterogeneity (via separate projections and custom HGC layers), and edge-heterogeneity (via split subgraphs). During classification, the learned label encoding is utilized to infer connected nodes for the given signal. The objective loss function combines BCE loss for multi-label classification and contrastive loss that handles node-heterogeneity.
  • ...and 6 more figures
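
The incidence matrix described in the Figure 2 caption can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes NumPy, the node ordering is arbitrary (the figure's own row order may differ), and the names `nodes`, `hyperedges`, and `build_incidence` are hypothetical. Entry $H_{ij}$ is 1 when node $i$ belongs to hyperedge $j$ and 0 otherwise.

```python
import numpy as np

# Seven nodes from the Figure 2 caption: two users, three activities,
# two phone-placement contexts. The ordering here is an assumption.
nodes = ["u1", "u2", "a_Ty", "a_S", "a_Ta", "c_H", "c_T"]

# Two hyperedges from the caption (the names e1 and e2 are hypothetical):
#   e1: user 1 typing with the phone in hand
#   e2: user 2 sitting and talking with the phone on a table
hyperedges = {
    "e1": ["u1", "a_Ty", "c_H"],
    "e2": ["u2", "a_Ta", "a_S", "c_T"],
}


def build_incidence(nodes, hyperedges):
    """Return a |V| x |E| binary incidence matrix H."""
    index = {name: i for i, name in enumerate(nodes)}
    H = np.zeros((len(nodes), len(hyperedges)), dtype=int)
    for j, members in enumerate(hyperedges.values()):
        for name in members:
            H[index[name], j] = 1
    return H


H = build_incidence(nodes, hyperedges)
node_degree = H.sum(axis=1)  # number of hyperedges each node belongs to
edge_degree = H.sum(axis=0)  # number of nodes in each hyperedge

print(H)
print("node degrees:", node_degree)
print("edge degrees:", edge_degree)
```

Unlike an ordinary graph edge, a single column of `H` can contain more than two ones, which is what lets one hyperedge tie a user, a context, and several co-occurring activities together.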

Theorems & Definitions (1)

  • Definition 1