Designing Time-Series Models With Hypernetworks & Adversarial Portfolios

Filip Staněk

TL;DR

This work addresses time-series forecasting across heterogeneous tasks with MtMs, a hypernetwork-based meta-learning framework in which a meta module maps task-specific mesa parameters to the parameters of a task-specific parametric model. Meta parameters and mesa parameters are optimized jointly, end to end, with ordinary backpropagation, which lets the model share structure across assets while still allowing each asset to differ. MtMs outperforms state-of-the-art meta-learning methods on sinusoidal regression and conventional parametric models on time series from the M4 competition, and it placed 4th in the M6 forecasting challenge. The accompanying investment strategy, which placed 6th, is adversarial yet risk-conscious: it adjusts portfolio weights to raise or lower the correlation with other participants' submissions, depending on the current ranking. Together, the two entries secured 1st place in the overall M6 duathlon. The approach is not specific to finance and may apply to other meta-learning settings.

Abstract

This article describes the methods that achieved 4th and 6th place in the forecasting and investment challenges, respectively, of the M6 competition, ultimately securing 1st place in the overall duathlon ranking. In the forecasting challenge, we tested a novel meta-learning model that uses hypernetworks to design a parametric model tailored to a specific family of forecasting tasks. This approach allowed us to leverage similarities observed across individual forecasting tasks while also acknowledging potential heterogeneity in their data generating processes. The model can be trained directly with backpropagation, without relying on higher-order derivatives, and its training is equivalent to a simultaneous search over the space of parametric functions and their optimal parameter values. The proposed model's capabilities extend beyond M6: it outperforms state-of-the-art meta-learning methods in the sinusoidal regression task and conventional parametric models on time series from the M4 competition. In the investment challenge, we adjusted portfolio weights to induce greater or smaller correlation between our submission and those of other participants, depending on the current ranking, aiming to maximize the probability of achieving a good rank.
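
To make the training claim concrete, the following is a minimal NumPy sketch of jointly fitting meta parameters and per-task mesa parameters with first-order gradients only. It is a toy under stated assumptions, not the model used in M6: the base model is a line $f(x;\beta)=\beta_0+\beta_1 x$, the meta module is linear, $g(\theta;\omega)=W\theta+b$, the data are synthetic, and the gradients are derived by hand via the chain rule rather than by an autodiff library. All names (`W`, `b`, `Theta`, `loss_and_grads`) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic toy data (assumption): M tasks, each a noisy line with its own slope.
M, n_per_task = 5, 40
slopes = rng.uniform(0.5, 2.5, size=M)
task = np.repeat(np.arange(M), n_per_task)      # task index of every observation
x = rng.uniform(-1.0, 1.0, size=task.size)
y = slopes[task] * x + 0.05 * rng.normal(size=task.size)

# Meta parameters omega = (W, b) of the linear meta module g(theta) = W theta + b,
# and one scalar mesa parameter theta per task (rows of Theta).
W = rng.normal(size=(2, 1)) * 0.5
b = np.zeros(2)
Theta = rng.normal(size=(M, 1)) * 0.1


def loss_and_grads(W, b, Theta):
    """Mean squared error and its first-order gradients w.r.t. W, b and Theta."""
    theta = Theta[task]                          # (N, 1) mesa parameter per observation
    beta = theta @ W.T + b                       # (N, 2) task-specific base parameters
    resid = beta[:, 0] + beta[:, 1] * x - y      # f(x; beta) - y
    loss = np.mean(resid ** 2)
    # dL/dbeta for every observation, then chain rule through g.
    d_beta = (2.0 / x.size) * resid[:, None] * np.stack([np.ones_like(x), x], axis=1)
    d_W = d_beta.T @ theta                       # (2, 1)
    d_b = d_beta.sum(axis=0)                     # (2,)
    d_Theta = np.zeros_like(Theta)
    np.add.at(d_Theta, task, d_beta @ W)         # accumulate per-task gradients
    return loss, d_W, d_b, d_Theta


initial_loss = loss_and_grads(W, b, Theta)[0]
lr = 0.1
for _ in range(2000):
    _, d_W, d_b, d_Theta = loss_and_grads(W, b, Theta)
    # One simultaneous gradient step on meta and mesa parameters.
    W -= lr * d_W
    b -= lr * d_b
    Theta -= lr * d_Theta
final_loss = loss_and_grads(W, b, Theta)[0]
print(f"loss before: {initial_loss:.4f}, after: {final_loss:.4f}")
```

Every update here uses only first derivatives of a single loss, so no inner optimization loop needs to be differentiated through; this is the property the abstract refers to when it says higher-order derivatives are not needed.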

Paper Structure (20 sections, 1 theorem, 14 equations, 7 figures, 3 tables)

Introduction
Forecasting challenge
Model
Motivation
Architecture
Application to the M6 competition
Data augmentation
Features
Model & training
Post-processing & predictions
Investment challenge
Scaling
Strategic positions
Investment decisions
Conclusions
...and 5 more sections

Key Result

Proposition 1

Under assumptions A1 and A2, there exist functions $f(\cdot; \beta): \mathbb{R}^{d_x} \rightarrow \mathbb{R}^{d_y}$ parameterized by $\beta \in B$ and $g(\cdot; \omega):\Theta \rightarrow B$ parameterized by $\omega \in \Omega$, such that the solution of coincides with the solution of the bilevel optimization problem introduced in Eq. eq:sample_bilevel_opt.

Figures (7)

Figure 1: A diagram of the MtMs model for an illustrative example with 6 features and 5 tasks. The process of generating forecasts proceeds from the right to left. First, a one-hot encoded vector $q$, denoting to which task the observation belongs, is multiplied by a matrix of mesa parameters $(\theta^{(1)},\,...\,,\theta^{(M)})$ to extract the corresponding task-specific mesa parameter vector $\theta$. This vector is then passed to the meta module $g(\theta; \omega)$ to generate task-specific parameters $\beta$ of the base model $f(x;\beta)$. Lastly, the network $f(x;\beta)$ is used to process the corresponding feature vector $x$ and generate the prediction $\hat{y}$.
Figure 2: A diagram of the MtMs model applied to M6. In the case of M6, there are 1000 tasks/assets (100 specified by the organizers and 900 from the additional 9 auxiliary M6-like datasets). Each asset is allotted one univariate mesa parameter $\theta$, which, through the meta module $g(\theta; \omega)$, determines the parameters $\beta$ of the network $f(x;\beta)$. This network then processes the corresponding feature vector $x$ to generate the prediction $\hat{y}$. The meta module $g(\theta; \omega)$ is a trivial single-layer neural network that connects $\theta$ to the weights and biases of the last layer of the network $f$; $\beta_{connected}$. The remaining nodes corresponding to parameters $\beta_{orphaned}$ are not influenced by $\theta$ and are hence constant across all tasks/assets.
Figure 3: Predicted probabilities of the 1st quintile plotted against the probabilities of the 5th quintile (upper panel) and predicted probabilities of the 2nd quintile plotted against the probabilities of the 4th quintile (lower panel).
Figure 4: Portfolio weights and the overall performance of the portfolio across individual submissions. Performance of the M6 dummy portfolio and the average performance of participants for comparison.
Figure A.5: MtMs predictions for the sinusoidal task ($K=5$). Plots of $f_{\omega}(x;\theta)$ as a function of $x$ for different values of the mesa parameter vector $\theta$. In the upper panel, the first mesa parameter $\theta[1]$ varies while $\theta[2]$ is fixed to its median value $-0.013$. In the lower panel, the second mesa parameter $\theta[2]$ varies while $\theta[1]$ is fixed to its median value $0.017$.
...and 2 more figures
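
The data flow described in the captions of Figures 1 and 2 can be sketched in a few lines of NumPy. This is an illustrative forward pass only, under several assumptions that the captions do not state: the layer sizes, the `tanh` hidden activation, the random initialization, and the softmax output over five classes are all placeholders, and every name (`mesa`, `omega_W`, `omega_b`, `forward`) is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes (assumptions): 5 tasks and 6 features as in Figure 1.
n_tasks, n_features, n_hidden, n_outputs = 5, 6, 8, 5

# Mesa parameters: one scalar theta per task, as in the M6 variant of Figure 2.
mesa = rng.normal(size=(n_tasks, 1))

# "Orphaned" parameters of the base network f: not influenced by theta,
# hence shared by all tasks.
W_hidden = rng.normal(size=(n_features, n_hidden)) * 0.1
b_hidden = np.zeros(n_hidden)

# Meta module g(theta; omega): a single linear layer that maps theta to the
# weights and biases of the last layer of f ("connected" parameters).
n_connected = n_hidden * n_outputs + n_outputs
omega_W = rng.normal(size=(1, n_connected)) * 0.1
omega_b = rng.normal(size=n_connected) * 0.1


def forward(x, task_id):
    """Return predicted class probabilities for feature vector x of one task."""
    q = np.zeros(n_tasks)
    q[task_id] = 1.0                       # one-hot task indicator
    theta = q @ mesa                       # select the task's mesa parameter, shape (1,)
    beta = theta @ omega_W + omega_b       # g(theta; omega), shape (n_connected,)
    W_out = beta[: n_hidden * n_outputs].reshape(n_hidden, n_outputs)
    b_out = beta[n_hidden * n_outputs:]
    h = np.tanh(x @ W_hidden + b_hidden)   # shared part of f
    logits = h @ W_out + b_out             # task-specific last layer of f
    e = np.exp(logits - logits.max())      # numerically stable softmax
    return e / e.sum()


x = rng.normal(size=n_features)
p_task0 = forward(x, 0)
p_task1 = forward(x, 1)
```

Because only the last layer depends on $\theta$, two tasks given the same feature vector share the hidden representation and differ only in how it is mapped to the output.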

Theorems & Definitions (1)

Proposition 1
