chatter: a Python library for applying information theory and AI/ML models to animal communication

Mason Youngblood

TL;DR

The paper addresses a limitation of discrete categorization in animal communication analysis by introducing chatter, a Python library for analyzing vocalizations in continuous latent space. Using information theory and modern neural architectures, it represents vocal sequences as trajectories in high-dimensional latent space. It provides an end-to-end workflow, from preprocessing and segmentation to latent-feature extraction and downstream metrics, that measures complexity, predictability, similarity, and novelty without unit labels. The approach is taxonomically agnostic and has been tested on the vocalizations of birds, bats, whales, and primates; it also integrates tools such as BirdNET and PaCMAP visualizations. Overall, chatter lowers the barrier to applying continuous representations to animal vocal repertoires and offers a modular framework that complements discrete-analysis tools.

Abstract

The study of animal communication often involves categorizing units into types (e.g. syllables in songbirds, or notes in humpback whales). While this approach is useful in many cases, it necessarily flattens the complexity and nuance present in real communication systems. chatter is a new Python library for analyzing animal communication in continuous latent space using information theory and modern machine learning techniques. It is taxonomically agnostic, and has been tested with the vocalizations of birds, bats, whales, and primates. By leveraging a variety of different architectures, including variational autoencoders and vision transformers, chatter represents vocal sequences as trajectories in high-dimensional latent space, bypassing the need for manual or automatic categorization of units. The library provides an end-to-end workflow -- from preprocessing and segmentation to model training and feature extraction -- that enables researchers to quantify the complexity, predictability, similarity, and novelty of vocal sequences.
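
To make the idea of a vocal sequence as a trajectory in latent space concrete, the following is a minimal sketch rather than chatter's own API. It assumes each unit in a sequence has already been encoded as a latent vector, so a sequence is an array of shape (units, dimensions). The function names `path_length` and `nearest_neighbor_novelty` are hypothetical, and the two quantities are simple stand-ins for label-free metrics of this kind, not the definitions used by the library.

```python
import numpy as np


def path_length(trajectory):
    """Total Euclidean distance travelled between consecutive units.

    `trajectory` has shape (n_units, n_dims). This is an illustrative
    proxy for how much a sequence moves through latent space, not a
    metric defined by chatter.
    """
    steps = np.diff(trajectory, axis=0)
    return float(np.linalg.norm(steps, axis=1).sum())


def nearest_neighbor_novelty(query, reference):
    """Distance from each query unit to its closest reference unit.

    Larger values mean the query unit is further from anything in the
    reference set. Both arrays have shape (n_units, n_dims).
    """
    diffs = query[:, None, :] - reference[None, :, :]
    distances = np.linalg.norm(diffs, axis=2)
    return distances.min(axis=1)


# Hypothetical latent vectors standing in for encoder output.
rng = np.random.default_rng(0)
reference_units = rng.normal(size=(50, 8))
new_sequence = rng.normal(loc=2.0, size=(10, 8))

print("path length:", path_length(new_sequence))
print("mean novelty:", nearest_neighbor_novelty(new_sequence, reference_units).mean())
```

Neither function needs unit labels: both operate directly on the continuous latent vectors, which is the property the abstract emphasizes.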

Paper Structure

This paper contains 2 sections and 2 figures.

Table of Contents

  1. Summary
  2. Statement of Need

Figures (2)

  • Figure 1: A basic diagram of the chatter workflow, showing the progression from spectrograms to latent features to visualizations in 2D space. Note that all of the information-theoretic analysis occurs in the original latent space, not in the reduced 2D space.
  • Figure 2: The latent space of Cassin's vireo syllables. The plot visualizes the syllables in a 2D latent space produced by applying PaCMAP to the latent features from a variational autoencoder, with representative spectrograms overlaid.
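
The note in Figure 1 about analyzing the original latent space rather than the 2D projection can be illustrated with a small sketch. It uses random vectors in place of real latent features and a linear PCA projection (computed with NumPy's SVD) as a simple stand-in for PaCMAP, which is nonlinear and distorts distances in different ways. The array sizes and the helper `pairwise_distances` are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical latent features: 200 units in a 32-dimensional space.
latents = rng.normal(size=(200, 32))


def pairwise_distances(x):
    """Euclidean distance between every pair of rows in `x`."""
    diffs = x[:, None, :] - x[None, :, :]
    return np.linalg.norm(diffs, axis=2)


# PCA via SVD: project the centered data onto its top two components.
centered = latents - latents.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
projected = centered @ vt[:2].T

d_full = pairwise_distances(latents)
d_2d = pairwise_distances(projected)

# An orthogonal projection can only shrink distances, so the 2D view
# keeps just a fraction of the separation present in the full space.
retained = d_2d.sum() / d_full.sum()
print(f"fraction of total pairwise distance retained in 2D: {retained:.2f}")
```

Because distances in the reduced view differ from those in the full space, metrics computed on the 2D coordinates would not match metrics computed on the latent features, which is one reason to treat the 2D plot as a visualization only.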