onelearn: Online learning in Python

Last update: Nov 06, 2022

Overview

onelearn stands for ONE-shot LEARNing. It is a small Python package for online learning. It provides:

online (or one-shot) learning algorithms: each sample is processed once, in a single pass over the data
multi-class classification and regression algorithms
for now, only ensemble methods, namely Random Forests
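
To make the single-pass idea concrete, here is a minimal sketch of a predict-then-update loop in which every sample is seen exactly once. The RunningCentroidClassifier class and the single_pass_accuracy helper are hypothetical stand-ins written for this illustration; they are not part of onelearn and do not implement its forest algorithm.

```python
from collections import Counter, defaultdict


class RunningCentroidClassifier:
    """Hypothetical stand-in for an online learner (NOT onelearn's algorithm).

    It keeps one running sum per class and predicts the nearest class centroid.
    """

    def __init__(self):
        self.counts = Counter()
        self.sums = defaultdict(list)

    def partial_fit(self, x, y):
        # Update the statistics of class y with a single sample.
        if not self.sums[y]:
            self.sums[y] = [0.0] * len(x)
        self.sums[y] = [s + v for s, v in zip(self.sums[y], x)]
        self.counts[y] += 1

    def predict(self, x):
        # Nearest class centroid in squared Euclidean distance.
        def dist(label):
            n = self.counts[label]
            return sum((v - s / n) ** 2 for v, s in zip(x, self.sums[label]))

        return min(self.counts, key=dist)


def single_pass_accuracy(model, stream):
    """Predict on each sample first, then learn from it; no second pass."""
    correct = total = 0
    for x, y in stream:
        if model.counts:  # nothing to predict with before the first sample
            correct += model.predict(x) == y
            total += 1
        model.partial_fit(x, y)
    return correct / total if total else 0.0


stream = [([0.0, 0.1], 0), ([1.0, 0.9], 1), ([0.1, 0.0], 0), ([0.9, 1.0], 1)]
model = RunningCentroidClassifier()
accuracy = single_pass_accuracy(model, stream)
```

The model never stores the stream itself, only per-class summaries, which is what allows the data to be discarded after one pass.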

Installation

The easiest way to install onelearn is with pip:

pip install onelearn

You can also install the latest development version directly from GitHub with:

pip install git+https://github.com/onelearn/onelearn.git

References

@article{mourtada2019amf,
  title={AMF: Aggregated Mondrian Forests for Online Learning},
  author={Mourtada, Jaouad and Ga{\"\i}ffas, St{\'e}phane and Scornet, Erwan},
  journal={arXiv preprint arXiv:1906.10529},
  year={2019}
}

Comments

Unable to pickle AMFClassifier.
I would like to save the AMFClassifier, but am unable to pickle it. I have also tried to use dill or joblib, but they also don't seem to work.

Is there another way to export the AMFClassifier, so that I can save it and load it in another kernel?

Below I added a snippet of code which reproduces the error. Note that the error occurs only when pickling after partial_fit has been called. When the AMFClassifier has not been fitted yet, pickling works without problems; however, exporting an empty model is pretty useless.

Any help or tips are much appreciated.

from onelearn import AMFClassifier
import dill as pickle
from sklearn import datasets

iris = datasets.load_iris()
X = iris.data
y = iris.target

amf = AMFClassifier(n_classes=3)

# Pickling the unfitted model works
dump = pickle.dumps(amf)
amf = pickle.loads(dump)

amf.partial_fit(X, y)

# Pickling after partial_fit raises the error
dump = pickle.dumps(amf)
amf = pickle.loads(dump)
opened by w-feijen 1
Move experiments of the paper to an experiments folder
Update the documentation

Explain that we must clone the repo

Also move the short experiments to an examples folder and build a Sphinx gallery with it
enhancement
opened by stephanegaiffas 1
Add some extra tests
Test that batch versus online training leads to the exact same forest

Test the behavior of reserve_samples, with several calls to partial_fit to check that memory is correctly allocated and

tests
opened by stephanegaiffas 1
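
A batch-versus-online test of the kind described in this issue could take the following shape. This is only a sketch: RunningMean is a hypothetical, deterministic estimator used for illustration. A real test for a forest would presumably also need a fixed random seed and a comparison of the trees' internal state, and neither is shown in this document.

```python
class RunningMean:
    """Hypothetical minimal online estimator, used only to illustrate the test."""

    def __init__(self):
        self.n = 0
        self.total = 0.0

    def partial_fit(self, values):
        # Accepts any iterable of numbers: one call may hold one sample or many.
        for v in values:
            self.n += 1
            self.total += v
        return self

    def state(self):
        return (self.n, self.total)


def test_batch_equals_online():
    data = [0.5, 1.25, -2.0, 4.0, 0.75]

    # One call with all the samples.
    batch = RunningMean().partial_fit(data)

    # One call per sample, in the same order.
    online = RunningMean()
    for v in data:
        online.partial_fit([v])

    # Exact equality is expected because both paths do the same operations.
    assert batch.state() == online.state()


test_batch_equals_online()
```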
What if predict_proba receives a single sample

get_amf_decision_online
    amf.partial_fit(X_train[iteration - 1], y_train[iteration - 1])
  File "/Users/stephanegaiffas/Code/onelearn/onelearn/forest.py", line 259, in partial_fit
    n_samples, n_features = X.shape

opened by stephanegaiffas 1
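
The traceback ends at the line that unpacks X.shape into two values, which fails when X is a single 1D sample. A caller-side workaround, sketched below with NumPy, is to reshape the sample to shape (1, n_features) before passing it on. The as_batch helper is hypothetical and not part of onelearn; whether the library itself accepts 1D input in later versions is not stated here.

```python
import numpy as np


def as_batch(X):
    """Return X as a 2D array of shape (n_samples, n_features).

    Hypothetical helper: a 1D input is treated as one sample.
    """
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X.reshape(1, -1)
    return X


x_single = np.array([5.1, 3.5, 1.4, 0.2])

try:
    # Same unpacking as in the traceback: a 1D shape has only one entry.
    n_samples, n_features = x_single.shape
except ValueError as err:
    error_message = str(err)

# After reshaping, the unpacking succeeds.
n_samples, n_features = as_batch(x_single).shape
```

The matching label would need the same treatment, for example np.atleast_1d(y_train[iteration - 1]), so that X and y agree on the number of samples.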
Improve coverage

A problem is that @jit functions don't work with coverage. A workaround is to disable JIT compilation using the NUMBA_DISABLE_JIT environment variable, but this breaks the code that uses @jitclass and the .class_type.instance_type attributes.
enhancement bug fix

opened by stephanegaiffas 1
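
As a sketch of the workaround mentioned in this issue, the snippet below launches a command in a child process with NUMBA_DISABLE_JIT set to 1, so that the variable is in place before numba is imported. The run_with_jit_disabled helper is hypothetical, and the command used here only echoes the variable; it does not fix the @jitclass breakage described above.

```python
import os
import subprocess
import sys


def run_with_jit_disabled(command):
    """Run `command` in a child process with NUMBA_DISABLE_JIT=1.

    Using a fresh process ensures the variable is set before numba is imported.
    """
    env = dict(os.environ, NUMBA_DISABLE_JIT="1")
    return subprocess.run(command, env=env, capture_output=True, text=True)


# Stand-in command that only prints the variable. A real run would invoke the
# test suite instead, e.g. [sys.executable, "-m", "pytest", "--cov=onelearn"],
# which assumes that pytest and pytest-cov are installed.
result = run_with_jit_disabled(
    [sys.executable, "-c", "import os; print(os.environ['NUMBA_DISABLE_JIT'])"]
)
```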

Releases (v0.3)

v0.3 (Sep 29, 2021)
This release adds the following improvement:

AMFClassifier and AMFRegressor can be serialized to files (using pickle internally) with the save and load methods
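
How save and load deal with the fitted state is not described here. One common pattern for objects that hold members pickle cannot handle is to export the state as plain Python data and rebuild the object on load. The sketch below illustrates that pattern only: NativeBuffer and Model are hypothetical classes, the lock is a stand-in for a compiled object, and the method signatures are assumptions rather than onelearn's API.

```python
import os
import pickle
import tempfile
import threading


class NativeBuffer:
    """Stand-in for a compiled object that pickle cannot handle."""

    def __init__(self, values):
        self.values = list(values)
        self._lock = threading.Lock()  # locks cannot be pickled


class Model:
    """Hypothetical model whose fitted state lives in a NativeBuffer."""

    def __init__(self, values=()):
        self.buffer = NativeBuffer(values)

    def partial_fit(self, values):
        self.buffer.values.extend(values)

    def save(self, filename):
        # Pickle only plain Python data, never the buffer itself.
        with open(filename, "wb") as f:
            pickle.dump({"values": list(self.buffer.values)}, f)

    @classmethod
    def load(cls, filename):
        # Rebuild the unpicklable member from the exported data.
        with open(filename, "rb") as f:
            state = pickle.load(f)
        return cls(state["values"])


model = Model()
model.partial_fit([1.0, 2.0, 3.0])

try:
    pickle.dumps(model.buffer)
    direct_pickle_failed = False
except (TypeError, pickle.PicklingError):
    direct_pickle_failed = True

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.pkl")
    model.save(path)
    restored = Model.load(path)
```

Keeping the file format to plain data also means the file can be read without the compiled classes being importable.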

v0.2.0 (Apr 6, 2020)
This release adds the following improvements:

SampleCollection pre-allocates more samples instead of the bare minimum for faster computation

The playground can be launched from the library

A documentation on readthedocs

Faster computations and a lot of code cleaning

Unit tests for Python 3.6-3.8


Owner

GitHub Repository https://onelearn.readthedocs.io

