Representation Learning of Daily Movement Data Using Text Encoders
Alexander Capstick, Tianyu Cui, Yu Chen, Payam Barnaghi
TL;DR
This work tackles learning meaningful representations from irregular, discrete-valued in-home activity time-series for people living with dementia. The authors convert each day of activity into a text string and fine-tune a sentence-embedding model (SE-MiniLM) with a triplet loss, so that days from the same participant within a $30$-day window map to nearby embeddings, yielding personalised day embeddings. These embeddings enable clustering into distinct activity patterns, vector search for similar days, and monitoring of behaviour changes, with initial evidence from UTI-label analyses and visualisations (e.g., 5 clusters identified by k-means). By leveraging pre-trained language-model representations and semantic similarity in vector space, the approach supports clinically relevant retrieval and change detection to inform personalised care delivery.
Abstract
Time-series representation learning is a key area of research for remote healthcare monitoring applications. In this work, we focus on a dataset of in-home activity recordings from people living with dementia. We design a representation learning method that converts activity to text strings and encodes them with a language model, fine-tuned so that data from the same participant within a $30$-day window maps to similar embeddings in the vector space. This allows for clustering and vector searching over participants and days, and for identifying activity deviations to aid personalised delivery of care.
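The training signal described above can be illustrated with a minimal sketch. It is not the authors' implementation. The text template in `day_to_text`, the rule that any day outside the same-participant $30$-day window counts as a negative, the Euclidean distance, and the margin value are all assumptions made for illustration. The function names are hypothetical.

```python
import math
import random
from datetime import date


def day_to_text(events):
    """Render one day's events as a single string, in time order.

    ASSUMPTION: events are (hour, location) pairs and the "HH:location"
    template is illustrative; the paper's exact text format is not given here.
    """
    return " ".join(f"{hour:02d}:{location}" for hour, location in sorted(events))


def sample_triplets(days, window=30, seed=0):
    """Build (anchor, positive, negative) index triplets.

    days: list of (participant_id, date) pairs, one per recorded day.
    Positive: same participant, at most `window` days from the anchor.
    Negative: any other day (ASSUMPTION: this includes the same
    participant outside the window as well as other participants).
    """
    rng = random.Random(seed)
    triplets = []
    for a, (pid_a, date_a) in enumerate(days):
        positives, negatives = [], []
        for j, (pid_j, date_j) in enumerate(days):
            if j == a:
                continue
            close = pid_j == pid_a and abs((date_j - date_a).days) <= window
            (positives if close else negatives).append(j)
        # Anchors without both a positive and a negative are skipped.
        if positives and negatives:
            triplets.append((a, rng.choice(positives), rng.choice(negatives)))
    return triplets


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Triplet margin loss with Euclidean distance (margin is an assumption)."""
    d_pos = math.dist(anchor, positive)
    d_neg = math.dist(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)


# Toy usage: two nearby days for p1, one distant day for p1, one day for p2.
days = [
    ("p1", date(2022, 1, 1)),
    ("p1", date(2022, 1, 10)),
    ("p1", date(2022, 6, 1)),
    ("p2", date(2022, 1, 5)),
]
triplets = sample_triplets(days)
```

In a full pipeline, the strings from `day_to_text` would be passed through the sentence-embedding model, and the loss would be computed on the resulting embeddings rather than on hand-written vectors. Minimising it pulls days from the same participant and period together while pushing other days at least a margin further away.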
