FDApy: a Python package for functional data

Steven Golovkine

TL;DR

FDApy is an open-source Python library for functional data analysis that supports densely and irregularly sampled, multivariate, and multidimensional data, complemented by a simulation toolbox and visualization tools. It provides two routes to multivariate FPCA: (i) diagonalization of the covariance operator via univariate FPCA and pooled scores, and (ii) diagonalization of the inner-product Gram matrix to obtain the multivariate eigenfunctions. The implementation is object-oriented: a FunctionalData class with specialized subclasses, a MultivariateFunctionalData container, and an MFPCA class offering covariance- and Gram-based decompositions. Applications to the Canadian weather, Primary Biliary Cirrhosis, and NBA shooting datasets illustrate practical performance and the interpretation of principal modes in complex functional data. The package enables Python-based workflows for advanced FDA tasks and invites community contributions for regression and classification on irregular multivariate functional data; full documentation and tests are available.

Abstract

We introduce FDApy, an open-source Python package for the analysis of functional data. The package provides tools for the representation of (multivariate) functional data defined on different dimensional domains and for functional data that is irregularly sampled. Additionally, dimension reduction techniques are implemented for multivariate and/or multidimensional functional data that are regularly or irregularly sampled. A toolbox for generating functional datasets is also provided. The documentation includes installation and usage instructions, examples on simulated and real datasets and a complete description of the API. FDApy is released under the MIT license. The code and documentation are available at https://github.com/StevenGolovkine/FDApy.
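
The two multivariate FPCA routes named in the TL;DR can be compared on a toy example. The sketch below is not FDApy's API: it uses NumPy only, and every function name (`trapezoid_weights`, `univariate_fpca`, `mfpca_covariance`, `mfpca_gram`) is hypothetical. It assumes densely sampled, centered curves, trapezoidal quadrature for the integrals, a 1/N normalization, and that all univariate components are kept when pooling scores. Under those assumptions both routes should return the same leading eigenvalues.

```python
import numpy as np


def trapezoid_weights(grid):
    """Quadrature weights w such that sum(w * f(grid)) approximates the integral of f."""
    w = np.zeros_like(grid, dtype=float)
    dt = np.diff(grid)
    w[:-1] += dt / 2
    w[1:] += dt / 2
    return w


def univariate_fpca(X, w):
    """Eigenvalues and scores of one centered component X (N x m) with weights w."""
    A = X * np.sqrt(w)                    # curves rescaled by sqrt of the weights
    cov = A.T @ A / X.shape[0]            # symmetrized, discretized covariance operator
    vals, vecs = np.linalg.eigh(cov)      # eigh returns ascending eigenvalues
    order = np.argsort(vals)[::-1]
    scores = A @ vecs[:, order]           # approximates the integral of X_i * phi_k
    return vals[order], scores


def mfpca_covariance(components, weights):
    """Route (i): univariate FPCA per component, then eigenanalysis of pooled scores."""
    pooled = np.hstack(
        [univariate_fpca(X, w)[1] for X, w in zip(components, weights)]
    )
    Z = pooled.T @ pooled / pooled.shape[0]
    return np.linalg.eigvalsh(Z)[::-1]    # descending eigenvalues


def mfpca_gram(components, weights):
    """Route (ii): eigenanalysis of the N x N inner-product (Gram) matrix."""
    N = components[0].shape[0]
    # <X_i, X_j> is the sum over components of the integrals of X_i^(p) * X_j^(p)
    M = sum((X * w) @ X.T for X, w in zip(components, weights)) / N
    vals, vecs = np.linalg.eigh(M)
    order = np.argsort(vals)[::-1]
    vals, vecs = vals[order], vecs[:, order]
    scores = vecs * np.sqrt(N * np.clip(vals, 0.0, None))
    return vals, scores


# Simulated bivariate functional data; the two components use different grids.
rng = np.random.default_rng(0)
N = 50
t1, t2 = np.linspace(0, 1, 30), np.linspace(0, 1, 20)
a = rng.normal(size=(N, 2)) * np.array([2.0, 1.0])
X1 = a[:, :1] * np.sin(2 * np.pi * t1) + a[:, 1:] * np.cos(2 * np.pi * t1)
X2 = a[:, :1] * t2 + a[:, 1:] * t2**2
X1 = X1 + 0.1 * rng.normal(size=X1.shape)
X2 = X2 + 0.1 * rng.normal(size=X2.shape)
components = [X1 - X1.mean(axis=0), X2 - X2.mean(axis=0)]   # center each component
weights = [trapezoid_weights(t1), trapezoid_weights(t2)]

eig_cov = mfpca_covariance(components, weights)
eig_gram, scores_gram = mfpca_gram(components, weights)
print(eig_cov[:3])
print(eig_gram[:3])
```

The agreement follows from the fact that the two matrices being diagonalized have the form AᵀA and AAᵀ for the same weighted data matrix A, so their nonzero eigenvalues coincide. In this sketch the Gram route works with an N × N matrix and the covariance route with a matrix whose size is the total number of grid points, which suggests the Gram route is attractive when there are few curves observed on many points.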

Paper Structure

This paper contains 15 sections, 14 equations, 11 figures.

Figures (11)

  • Figure 1: Canadian weather dataset. Each color represents a different Canadian weather station.
  • Figure 2: Primary Biliary Cirrhosis dataset. Each color represents a different patient. Each point represents an observation for a given patient. The observation grid is different for each patient.
  • Figure 3: NBA shooting dataset.
  • Figure 4: Schematic representation of the class hierarchy.
  • Figure 5: Canadian weather dataset mean and covariance.
  • ...and 6 more figures

Theorems & Definitions (1)

  • Remark 1