scikit-fda: A Python Package for Functional Data Analysis
Carlos Ramos-Carreño, José Luis Torrecilla, Miguel Carbajo-Berrocal, Pablo Marcos, Alberto Suárez
TL;DR
scikit-fda is a Python toolkit for functional data analysis (FDA). It offers two complementary representations of functional data, discretized grids and basis expansions, behind a unified FData interface. The library covers full FDA workflows: interpolation, derivatives, and regularization; preprocessing (smoothing, registration, FPCA, variable selection); and exploratory analysis (depth measures, robust statistics, functional boxplots), all compatible with scikit-learn pipelines. It also ships synthetic and real-world datasets, interactive visualization, documentation, and tests, which support reproducible research. Because it is built on the Python scientific ecosystem and released under a 3-Clause BSD license, scikit-fda makes FDA methods accessible and interoperable in scientific computing and machine learning settings.
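To make the two representations concrete, the following is a minimal standard-library sketch, not scikit-fda's own API. It stores a curve either as samples on a uniform grid or as coefficients in a Fourier basis, and converts between the two. The helper names (`fourier_basis`, `grid_to_basis`, `basis_to_grid`) are hypothetical, and the choice of an orthonormal Fourier basis on [0, 1] is an assumption made so that the least-squares fit reduces to simple averages.

```python
import math


def fourier_basis(n_basis):
    """Return n_basis orthonormal Fourier functions on [0, 1] (hypothetical helper)."""
    if n_basis % 2 == 0:
        raise ValueError("n_basis must be odd: one constant plus sine/cosine pairs")
    funcs = [lambda t: 1.0]
    for k in range(1, (n_basis - 1) // 2 + 1):
        # Bind k as a default argument so each lambda keeps its own frequency.
        funcs.append(lambda t, k=k: math.sqrt(2) * math.sin(2 * math.pi * k * t))
        funcs.append(lambda t, k=k: math.sqrt(2) * math.cos(2 * math.pi * k * t))
    return funcs


def grid_to_basis(values, n_basis):
    """Convert samples on the uniform grid t_j = j / N into basis coefficients.

    On this grid the basis is orthonormal under the discrete inner product
    (as long as n_basis <= N), so the least-squares coefficients are averages.
    """
    n = len(values)
    if n_basis > n:
        raise ValueError("need at least as many samples as basis functions")
    grid = [j / n for j in range(n)]
    return [
        sum(v * phi(t) for v, t in zip(values, grid)) / n
        for phi in fourier_basis(n_basis)
    ]


def basis_to_grid(coefs, grid):
    """Evaluate the basis expansion at the given grid points."""
    basis = fourier_basis(len(coefs))
    return [sum(c * phi(t) for c, phi in zip(coefs, basis)) for t in grid]


# Discretized representation: 32 samples of x(t) = 1 + 2 sin(2 pi t).
n_points = 32
grid = [j / n_points for j in range(n_points)]
samples = [1.0 + 2.0 * math.sin(2 * math.pi * t) for t in grid]

# Basis representation: 5 coefficients instead of 32 values.
coefs = grid_to_basis(samples, n_basis=5)
reconstructed = basis_to_grid(coefs, grid)
```

In this sketch the basis form is more compact and can be evaluated at any point, while the grid form stays closest to the raw measurements. That is the kind of trade-off that motivates offering both representations.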
Abstract
The library scikit-fda is a Python package for Functional Data Analysis (FDA). It provides a comprehensive set of tools for representation, preprocessing, and exploratory analysis of functional data. The library is built upon and integrated into Python's scientific ecosystem. In particular, it conforms to the scikit-learn application programming interface so as to take advantage of the functionality for machine learning provided by this package: pipelines, model selection, and hyperparameter tuning, among others. The scikit-fda package has been released as free and open-source software under a 3-Clause BSD license and is open to contributions from the FDA community. The library's extensive documentation includes step-by-step tutorials and detailed examples of use.
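As a rough illustration of what conforming to the scikit-learn interface means in practice, the sketch below mimics the `fit`/`transform` convention in plain Python. It imports neither scikit-learn nor scikit-fda. The classes `MovingAverageSmoother` and `Centerer` and the function `fit_transform_chain` are hypothetical stand-ins, chosen only to show how steps that share this convention can be chained the way a pipeline chains them.

```python
class MovingAverageSmoother:
    """Hypothetical transformer that smooths each discretized curve."""

    def __init__(self, window=3):
        # Hyperparameters are stored unchanged in __init__, following the
        # scikit-learn convention, so that tuning tools could vary them.
        self.window = window

    def fit(self, X, y=None):
        # Nothing is learned from the data; return self to allow chaining.
        return self

    def transform(self, X):
        half = self.window // 2
        smoothed = []
        for curve in X:
            row = []
            for i in range(len(curve)):
                # Average over a window that is truncated at the boundaries.
                lo, hi = max(0, i - half), min(len(curve), i + half + 1)
                row.append(sum(curve[lo:hi]) / (hi - lo))
            smoothed.append(row)
        return smoothed


class Centerer:
    """Hypothetical transformer that subtracts the pointwise mean curve."""

    def fit(self, X, y=None):
        n_curves = len(X)
        # Learned attributes carry a trailing underscore by convention.
        self.mean_ = [sum(column) / n_curves for column in zip(*X)]
        return self

    def transform(self, X):
        return [[v - m for v, m in zip(curve, self.mean_)] for curve in X]


def fit_transform_chain(steps, X):
    """Fit each step on the output of the previous one, as a pipeline would."""
    for step in steps:
        X = step.fit(X).transform(X)
    return X


# Two curves sampled at four grid points each.
curves = [[0.0, 3.0, 0.0, 3.0], [2.0, 5.0, 2.0, 5.0]]
result = fit_transform_chain([MovingAverageSmoother(window=3), Centerer()], curves)
```

Because every step exposes the same two methods, generic tooling can compose the steps, swap them, or tune their constructor arguments without knowing what each one does internally.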
