An Information Theory Treatment of Animal Movement Tracks

Wayne M Getz

TL;DR

The paper presents a novel information-theoretic framework for fine-scale animal movement analysis by converting relocation tracks into a hierarchical sequence of fixed-length movement elements (StaMEs), words, and canonical activity modes (CAMs). By computing entropies and divergence measures, it enables rigorous evaluation and comparison of segmentation and clustering choices, including a Jensen-Shannon ensemble divergence to assess coding efficacy across methods. The approach yields a quantitative basis for comparing information content and coding accuracy of movement tracks, offering a scalable tool that complements traditional change-point and HMM-based methods. Its emphasis on high-resolution data and parameterized, loss-aware coding holds significant potential for understanding how internal states and landscapes shape movement, with implications for ecology and responses to global change.

Abstract

Position recordings of the two-dimensional tracks of animals moving over landscapes have progressed over the past three decades from hourly to second-by-second locations. Track segmentation methods for analyzing the behavioral information in such relocation data have lagged somewhat behind, with scales of analysis currently at the sub-hourly to minute level. A new approach is needed to bring segmentation analysis down to a second-by-second level. Here, a fine-scale approach is presented that rests heavily on concepts from Shannon's Information Theory. In this paper, we first briefly review and update concepts relating to movement path segmentation. We then discuss how cluster analysis can be used to organize the smallest viable statistical movement elements (StaMEs), which are $\mu$ steps long, and to code the next level of movement elements, called "words", which are $m\mu$ steps long. Centroids of these word clusters are identified as canonical activity modes (CAMs). Unlike current behavioral change point analysis and hidden Markov model segmentation schemes, the approach presented here allows us to provide entropy measures for movement paths, to compute the coding efficiencies of derived StaMEs and CAMs, and to assess error rates in the allocation of strings of $m$ StaMEs to CAM types. In addition, our approach allows us to employ the Jensen-Shannon divergence measure to assess and compare the best choices for the various parameters (number of steps in a StaME, number of StaME types, number of StaMEs in a word, number of CAM types), as well as the best clustering methods for generating segments that can then be used to interpret and predict sequences of higher-order segments. The theory presented here provides another tool in our toolbox for dealing with the effects of global change on the movement and redistribution of animals across altered landscapes.
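As a rough illustration of the two information measures named in the abstract, the sketch below computes the Shannon entropy of a coded symbol sequence and the Jensen-Shannon divergence between two such sequences. It is a minimal example and not the paper's implementation: the function names, the toy symbol strings, and the restriction to two equally weighted distributions (rather than an ensemble) are all assumptions made for illustration.

```python
import math
from collections import Counter


def distribution(symbols):
    """Empirical probability of each symbol in a coded sequence."""
    counts = Counter(symbols)
    total = len(symbols)
    return {s: c / total for s, c in counts.items()}


def shannon_entropy(p):
    """Shannon entropy in bits of a distribution given as {symbol: prob}."""
    return -sum(q * math.log2(q) for q in p.values() if q > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits between two distributions.

    Uses JSD(P, Q) = H(M) - (H(P) + H(Q)) / 2 with M = (P + Q) / 2.
    Equal weights on P and Q are an assumption of this sketch.
    """
    support = set(p) | set(q)
    mix = {s: 0.5 * (p.get(s, 0.0) + q.get(s, 0.0)) for s in support}
    return shannon_entropy(mix) - 0.5 * (shannon_entropy(p) + shannon_entropy(q))


# Hypothetical StaME symbol strings produced by two different codings.
track_a = ["s1", "s2", "s3", "s4", "s1", "s2", "s3", "s4"]
track_b = ["s1", "s1", "s1", "s2", "s1", "s1", "s2", "s1"]

p_a = distribution(track_a)
p_b = distribution(track_b)
print("H(A) bits:", shannon_entropy(p_a))
print("H(B) bits:", shannon_entropy(p_b))
print("JSD(A, B) bits:", js_divergence(p_a, p_b))
```

With base-2 logarithms the divergence lies between 0 (identical distributions) and 1 bit (disjoint supports), which makes it convenient for comparing codings that use different numbers of symbol types.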

Paper Structure (6 sections, 10 equations, 1 figure, 2 tables)

Figures (1)

  • Figure 1: A. A graphic depiction of Tasks 1-6 required to code an animal movement relocation data series ($\mathcal{T}^{\rm loc}$) into a statistical movement elements (StaMEs) data series $\mathcal{T}^{\sigma}$ and a canonical activity modes (CAMs) data series $\mathcal{T}^{\rm CAM}$, and to compute the coding accuracy $E^{\mathcal{\kappa}}$ of the process: i.e., to implement the functor $\mathcal{M}:\mathcal{T}^{\rm loc} \mapsto \left\{\mathcal{T}^{\sigma},\mathcal{T}^{\rm CAM},E^{\mathcal{\kappa}} \right\}$. The parameter values $\mu$, $n$, $m$, and $k$, which need to be either selected a priori or determined during implementation of the clustering methods, are highlighted in blue. Extracted objects are: segments in green (Tasks 1 and 3), symbols in brown (Task 2), words in fuchsia (Tasks 3-5), CAMs in orange (Tasks 4 and 6), and a coding accuracy measure in red (Task 6). With regard to the latter, the red dotted arrows between Tasks 5 and 6 indicate misassignment of a proportion of the $N_{\ell}$ words of type $\omega_\ell$ ($\ell=1,\cdots,n^m$; $N^{\rm wd} = \sum_{\ell=1}^{n^m} N_\ell$): these are words assigned to CAM $\kappa_{c_1}^\star$ that were initially members of $\mathcal{W}_{c_2}$, for $c_1 \ne c_2$ and $c_1,c_2 \in \{1,\cdots,k\}$. A summary of the tasks and objects produced at each step of the process is provided at the bottom of this graphic. (Note: calligraphic letters in the figure and caption denote the same symbols, but are generated by different font sets.)
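The word-counting bookkeeping in the caption can be sketched as follows. This minimal example takes a StaME symbol sequence drawn from $n$ symbol types, groups it into words of $m$ symbols, and tallies the count $N_\ell$ of each word type so that $N^{\rm wd} = \sum_\ell N_\ell$. The non-overlapping grouping, the dropping of a trailing partial word, the function name `to_words`, and the toy symbol sequence are assumptions of this sketch and are not taken from the paper. The clustering steps that produce symbols and CAMs are not shown.

```python
from collections import Counter


def to_words(symbols, m):
    """Group a symbol sequence into non-overlapping words of m symbols.

    A trailing run of fewer than m symbols is dropped (an assumption).
    """
    n_words = len(symbols) // m
    return [tuple(symbols[i * m:(i + 1) * m]) for i in range(n_words)]


# Hypothetical coded track using n = 3 StaME symbol types (0, 1, 2).
n = 3
m = 3
symbols = [0, 1, 1, 0, 1, 1, 2, 2, 0, 0, 1, 1]

words = to_words(symbols, m)
counts = Counter(words)              # N_l for each observed word type
n_word_total = sum(counts.values())  # N^wd, the sum of N_l over word types
max_word_types = n ** m              # at most n^m distinct word types

print("observed word types:", dict(counts))
print("N^wd:", n_word_total)
print("possible word types (n^m):", max_word_types)
```

Only a small fraction of the $n^m$ possible word types is typically observed in a short track, which is one reason the word-level counts are kept as a sparse tally here rather than a dense array.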