Python module for machine learning time series:

Last update: Dec 29, 2022

Overview

seglearn

Seglearn is a Python package for machine learning time series or sequences. It provides an integrated pipeline for segmentation, feature extraction, feature processing, and a final estimator. Seglearn offers a flexible approach to multivariate time series and related contextual (meta) data for classification, regression, and forecasting problems. Support and examples are provided for learning time series with classical machine learning and deep learning models. It is compatible with scikit-learn.
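To make the segmentation and feature extraction steps concrete, here is a minimal standard-library sketch of the general idea: cut a series into overlapping fixed-width windows, then compute a small feature vector per window. This is not seglearn's API. The function names (sliding_windows, window_features), the window width, and the step size are all hypothetical choices made for illustration.

```python
from statistics import mean, pstdev


def sliding_windows(series, width, step):
    """Split a sequence into fixed-width windows, advancing by `step` samples."""
    if width <= 0 or step <= 0:
        raise ValueError("width and step must be positive")
    # Windows that would run past the end of the series are dropped.
    return [series[i:i + width] for i in range(0, len(series) - width + 1, step)]


def window_features(window):
    """Per-window feature vector: mean, population std, min, max."""
    return [mean(window), pstdev(window), min(window), max(window)]


# Toy univariate series (illustrative data only).
series = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]

# Width 4 with step 2 gives 50% overlap between consecutive windows.
windows = sliding_windows(series, width=4, step=2)
features = [window_features(w) for w in windows]
```

Each row of features could then be passed to any estimator that accepts a fixed-length feature vector, which is the role the final estimator plays in the pipeline described above.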

Documentation

Installation instructions, API documentation, and examples can be found in the documentation.

Dependencies

seglearn is tested to work under Python 3.5. The dependency requirements are based on the last scikit-learn release:

scipy (>=0.17.0)
numpy (>=1.11.0)
scikit-learn (>=0.21.3)

Additionally, to run the examples, you need:

matplotlib (>=2.0.0)
keras (>=2.1.4) for the neural network examples
pandas

In order to run the test cases, you need:

pytest

The neural network examples were tested on keras using the tensorflow-gpu backend, which is recommended.
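As a rough way to see which of the packages listed above are present in the current environment, the following sketch queries installed distribution metadata. It assumes Python 3.8 or newer (for importlib.metadata), which is later than the Python 3.5 mentioned above. The helper names installed_version and report are hypothetical, and the sketch only reports versions; it does not compare them against the minimums.

```python
from importlib import metadata

# Distribution names taken from the Dependencies section.
REQUIRED = ["scipy", "numpy", "scikit-learn"]
OPTIONAL = ["matplotlib", "keras", "pandas", "pytest"]


def installed_version(name):
    """Return the installed version string for a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def report(names):
    """Map each distribution name to its installed version (None when missing)."""
    return {name: installed_version(name) for name in names}


if __name__ == "__main__":
    for name, version in report(REQUIRED + OPTIONAL).items():
        print(f"{name}: {version or 'not installed'}")
```

Running the script prints one line per package, which can be compared by eye against the minimum versions listed above.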

Installation

seglearn is currently available on PyPI, and you can install it via pip:

pip install -U seglearn

or, if you use Python 3:

pip3 install -U seglearn

If you prefer, you can clone the repository and install from source. Use the following commands to get a copy from GitHub and install seglearn along with its dependencies:

git clone https://github.com/dmbee/seglearn.git
cd seglearn
pip install .

Or install using pip and GitHub:

pip install -U git+https://github.com/dmbee/seglearn.git

Testing

After installation, you can use pytest to run the test suite from seglearn's root directory:

pytest

Change Log

Version history can be viewed in the Change Log.

Development

Development of this scikit-learn-contrib project follows the practices of the scikit-learn community, so you can refer to their Development Guide.

Please submit new pull requests on the dev branch with unit tests and an example to demonstrate any new functionality or API changes.

Citing seglearn

If you use seglearn in a scientific publication, we would appreciate citations to the following paper:

@article{arXiv:1803.08118,
author  = {David Burns and Cari Whyne},
title   = {Seglearn: A Python Package for Learning Sequences and Time Series},
journal = {arXiv},
year    = {2018},
url     = {https://arxiv.org/abs/1803.08118}
}

If you use the seglearn test data in a scientific publication, we would appreciate citations to the following paper:

@article{arXiv:1802.01489,
author  = {David Burns and Nathan Leung and Michael Hardisty and Cari Whyne and Patrick Henry and Stewart McLachlin},
title   = {Shoulder Physiotherapy Exercise Recognition: Machine Learning the Inertial Signals from a Smartwatch},
journal = {arXiv},
year    = {2018},
url     = {https://arxiv.org/abs/1802.01489}
}


Owner

David Burns
