Sig-Networks Toolkit: Signature Networks for Longitudinal Language Modelling

Talia Tseriotou; Ryan Sze-Yin Chan; Adam Tsakalidis; Iman Munire Bilal; Elena Kochkina; Terry Lyons; Maria Liakata

TL;DR

Sig-Networks is an open-source toolkit for longitudinal NLP that uses path signatures and log-signatures to compress and aggregate sequential text data. It provides a complete, pip-installable pipeline (nlpsig for preprocessing, sig-networks for PyTorch models) with flexible time-feature integration and hyperparameter tuning, and achieves state-of-the-art performance on three tasks of varying temporal granularity. The approach combines Signature Window Network Units, attention-based variants, and Seq-Sig-Net to capture both short- and long-range temporal dependencies, using log-signatures of depth $N=3$, and is robust across tasks whose temporal granularity ranges from seconds to hours. The toolkit gives researchers and developers a practical, extensible framework for plugging in their own data and building further longitudinal NLP models.

Abstract

We present Sig-Networks, an open-source, pip-installable toolkit that is the first of its kind for longitudinal language modelling. A central focus is the incorporation of signature-based neural network models, which have recently shown success in temporal tasks. We apply and extend published research, providing a full suite of signature-based models whose components can be used as PyTorch building blocks in future architectures. Sig-Networks enables task-agnostic dataset plug-in, seamless pre-processing for sequential data, parameter flexibility, and automated tuning across a range of models. We examine signature networks on three NLP tasks of varying temporal granularity: counselling conversations, rumour stance switches, and mood changes in social media threads. We show SOTA performance on all three and provide guidance for future tasks. We release the toolkit as a PyTorch package, together with an introductory video and Git repositories for preprocessing and modelling that include sample notebooks on the modelled NLP tasks.
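To make the central object concrete, the following is a minimal sketch of a truncated path signature and log-signature for a piecewise-linear path, written in plain NumPy. It is not the toolkit's implementation: the function names `signature_depth2` and `logsignature_depth2` are hypothetical, and the truncation depth is 2 for brevity, whereas the TL;DR above reports depth 3.

```python
import numpy as np


def signature_depth2(path):
    """Depth-2 signature of a piecewise-linear path of shape (length, channels).

    Hypothetical helper for illustration only; not part of nlpsig or sig-networks.
    """
    path = np.asarray(path, dtype=float)
    channels = path.shape[1]
    level1 = np.zeros(channels)               # total increment per channel
    level2 = np.zeros((channels, channels))   # iterated integrals of order 2
    for delta in np.diff(path, axis=0):
        # Chen's identity for appending one linear segment with increment delta:
        # new_level2 = old_level2 + old_level1 (x) delta + (delta (x) delta) / 2
        level2 += np.outer(level1, delta) + np.outer(delta, delta) / 2.0
        level1 += delta
    return level1, level2


def logsignature_depth2(path):
    """Depth-2 log-signature: the increment plus the antisymmetric (area) part."""
    level1, level2 = signature_depth2(path)
    # At depth 2 the log removes the symmetric part level1 (x) level1 / 2.
    area = level2 - np.outer(level1, level1) / 2.0
    return level1, area


# An L-shaped path in two channels: right by one unit, then up by one unit.
example_path = [[0.0, 0.0], [1.0, 0.0], [1.0, 1.0]]
increment, second_level = signature_depth2(example_path)
_, levy_area = logsignature_depth2(example_path)
```

The signature has a fixed size that depends only on the number of channels and the depth, not on the path length, which is why it can act as a compression of a variable-length stream. The log-signature keeps the same information at depth 2 with fewer independent terms, since its second level is antisymmetric.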

Paper Structure (25 sections, 3 figures, 5 tables)


Introduction
Related Work
Methodological Foundations
Task Formulation and Background
System Overview
Feature Encoding
Signature Network Models
System Components
Data Preparation Modules in nlpsig
Training
Model Modules
Experiments
Tasks and Datasets
Models and Baselines
Results and Discussion
...and 10 more sections

Figures (3)

Figure 1: Sig-Networks Toolkit Overview. The nlpsig library (left side) takes the input text, label and stream id for each data point. The package supports embedding extraction (e.g. SBERT) and dimensionality reduction of the embeddings, with optional processing and concatenation of non-linguistic features. For each data point, a stream/window (padded if necessary) is formed that includes its ordered history. These are shifted and stacked for unit-based models. Data splitting is performed, with a k-fold option. The sig-networks library (right side) provides PyTorch implementations of the whole Sig-Networks family and of the baseline models, with user-specified training and hyperparameter inputs.
Figure 2: Signature Window Unit and its variations.
Figure 3: Seq-Sig-Net and its variations using SWNU (yellow, see Fig. 2) on a sample length of 11 points.
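
The windowing step described in the Figure 1 caption, where each data point receives a padded window containing its ordered history, can be sketched as follows. This is an illustration under stated assumptions and not the nlpsig API: the function name `history_windows` is hypothetical, a single stream is assumed, and zero padding at the front of the window is an assumed convention.

```python
import numpy as np


def history_windows(embeddings, window_size):
    """Build one fixed-length history window per data point in a single stream.

    Hypothetical helper for illustration; not part of nlpsig.
    embeddings: array of shape (num_points, dim), in chronological order.
    Returns an array of shape (num_points, window_size, dim).
    """
    embeddings = np.asarray(embeddings, dtype=float)
    num_points, dim = embeddings.shape
    windows = np.zeros((num_points, window_size, dim))
    for i in range(num_points):
        start = max(0, i - window_size + 1)
        history = embeddings[start : i + 1]   # the point and its recent history
        # Assumed convention: zero-pad at the front so the current point is last.
        windows[i, window_size - len(history) :] = history
    return windows


# Three data points with 2-dimensional embeddings, windows of length 2.
stream = np.arange(6, dtype=float).reshape(3, 2)
windows = history_windows(stream, window_size=2)
```

Every data point then has an input of the same shape regardless of how much history precedes it, which is what allows the windows to be batched for a PyTorch model.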
