Hedging Is Not All You Need: A Simple Baseline for Online Learning Under Haphazard Inputs

Himanshu Buckchash, Momojit Biswas, Rohit Agarwal, Dilip K. Prasad

TL;DR

The paper tackles online learning with haphazard inputs from sensor streams where the input space varies and data reliability is imperfect. It reframes hedging as a weighted residual mechanism and introduces HapNet, a simple self-attention baseline that processes input features directly (without embedding) using masked auxiliary features and positional information, with inference omitting the randomization. To handle cases with completely unavailable or positionally uncorrelated inputs, HapNetPU adds a feedback loop (recurrence) to extend the model's capability. Evaluations on five time-series benchmarks show HapNet is competitive with state-of-the-art hedging-based approaches and remains robust across a range of ablations, suggesting that a lightweight self-attention architecture can generalize well to non-fixed input spaces and varying data availability.

Abstract

Handling haphazard streaming data, such as data from edge devices, is a challenging problem. Over time, the incoming data becomes inconsistent: inputs go missing, turn faulty, newly appear, or reappear. Such data therefore requires reliable models. Recent methods for this problem rely on hedging-based solutions and require specialized elements such as auxiliary dropouts, forked architectures, and intricate network design. We observe that hedging can be reduced to a special case of a weighted residual connection, which motivates us to approximate it with plain self-attention. In this work, we propose HapNet, a simple baseline that is scalable, does not require online backpropagation, and is adaptable to varying input types. All existing methods are restricted to scaling with a fixed window; we introduce the harder problem of scaling with a variable window, where the data becomes positionally uncorrelated and which existing methods cannot address. We demonstrate that a variant of the proposed approach works even in this scenario. We evaluate the proposed approach extensively on five benchmarks and find competitive performance.
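One way to read the abstract's observation that hedging resembles a weighted residual connection is sketched below. This is a generic illustration of the standard Hedge rule (a weighted sum of per-depth predictions, with weights decayed multiplicatively by each depth's loss), not the paper's own formulation. The names `hedge_combine` and `hedge_update`, the discount `beta`, and the three example predictions are all assumptions made for illustration.

```python
import math


def hedge_combine(layer_preds, alphas):
    """Final prediction as a weighted sum of per-depth predictions.

    Each depth contributes directly to the output, scaled by its weight,
    which is why the sum can be read as a set of weighted skip connections.
    """
    return sum(a * p for a, p in zip(alphas, layer_preds))


def hedge_update(alphas, losses, beta=0.99):
    """Hedge step: alpha_l <- alpha_l * beta**loss_l, then renormalize."""
    scaled = [a * beta ** loss for a, loss in zip(alphas, losses)]
    total = sum(scaled)
    return [s / total for s in scaled]


# Hypothetical outputs of three per-depth classifiers: P(class = 1).
layer_preds = [0.9, 0.6, 0.2]
alphas = [1 / 3, 1 / 3, 1 / 3]
y_true = 1.0

# Combined prediction before the weights are updated.
y_hat = hedge_combine(layer_preds, alphas)

# Binary cross-entropy of each depth's prediction against the true label.
losses = [
    -(y_true * math.log(p) + (1 - y_true) * math.log(1 - p))
    for p in layer_preds
]

# Depths with lower loss keep more weight after the update.
alphas = hedge_update(alphas, losses)
```

After the update, the depth that predicted closest to the true label holds the largest weight. Under this reading, self-attention would replace the hand-crafted weight update with learned, input-dependent weights.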

Paper Structure

This paper contains 4 sections, 3 equations, 1 figure, 9 tables.

Figures (1)

  • Figure 1: The two proposed models: (a) HapNet, for the positionally correlated case, and (b) HapNetPU, for the positionally uncorrelated case. Both panels show how an input feature $f_t$ is processed. $\bigodot$ denotes the Hadamard product; $\circleddash$ denotes the inverse Hadamard operation, which removes the values at specific positions and compresses the dimensions of the input feature at time $t$, making the features positionally uncorrelated. Notation (all at time $t$): $f_t$ is the input features, $f^b_t$ the base features, $f^a_t$ the auxiliary features, $f^m_t$ the masked features, $m^r_t$ the randomized mask, $e^s_t$ the scalar position encoding, $f^l_t$ the features remaining after feature loss, $c_t$ the context, and $\hat{y}_t$ the predicted label.
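
The two operators named in the caption can be illustrated with a minimal sketch. It assumes a binary mask over feature positions; the function names `hadamard` and `inverse_hadamard` and the example values are hypothetical, and the mask here is fixed rather than randomized as $m^r_t$ is in the figure.

```python
def hadamard(features, mask):
    """Element-wise product: masked-out positions become 0, length is kept."""
    return [f * m for f, m in zip(features, mask)]


def inverse_hadamard(features, mask):
    """Drop the values at masked-out positions and compress the vector.

    The output is shorter than the input, so a value's index no longer
    identifies which original feature it came from.
    """
    return [f for f, m in zip(features, mask) if m]


# Hypothetical input features at one time step and a binary mask.
f_t = [0.5, 1.2, -0.3, 2.0]
m_t = [1, 0, 1, 0]

masked = hadamard(f_t, m_t)              # same length, zeros where mask is 0
compressed = inverse_hadamard(f_t, m_t)  # only the surviving values remain
```

With the Hadamard product, position 2 of the output still refers to the third input feature. After the inverse Hadamard operation, the same value sits at position 1, which is the sense in which the remaining features become positionally uncorrelated.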