MazeRL is an application-oriented Deep Reinforcement Learning (RL) framework

Last update: Dec 24, 2022

Overview

Applied Reinforcement Learning with Python

MazeRL is an application-oriented Deep Reinforcement Learning (RL) framework that addresses real-world decision problems. Our vision is to cover the complete development life cycle of RL applications, from simulation engineering to agent development, training and deployment.

This is a preliminary, non-stable release of Maze. It is not yet complete and not all of our interfaces have settled yet. Hence, there might be some breaking changes on our way towards the first stable release.

Spotlight Features

Below we list a few selected Maze features.

- Design and visualize your policy and value networks with the Perception Module. It is based on PyTorch and provides a large variety of neural network building blocks and model styles. Quickly compose powerful representation learners from building blocks such as dense, convolution, graph convolution, attention and self-attention layers, recurrent architectures, and action and observation masking.
- Maze supports best practices such as pre-processing and normalizing your observations, so you can create the conditions for efficient RL training without writing boilerplate code.
- Maze supports advanced environment structures that reflect the requirements of real-world industrial decision problems, such as multi-step and multi-agent scenarios. You can of course also work with existing Gym-compatible environments.
- Use the provided Maze trainers (A2C, PPO, Impala, SAC, Evolution Strategies), which support dictionary action and observation spaces as well as multi-step training (auto-regressive policies). Or stick to your favorite tools and trainers by combining Maze with other RL frameworks.
- Out-of-the-box support for advanced training workflows such as imitation learning from teacher policies and policy fine-tuning.
- Keep even complex application and experiment configuration manageable with the Hydra Config System.
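
To make the observation normalization point above more concrete, here is a minimal sketch in plain Python. It does not use Maze's API: the class `RunningNormalizer` and its methods are hypothetical names chosen for illustration, and the sketch assumes scalar observations and a running mean/standard-deviation estimate (Welford's algorithm).

```python
import math


class RunningNormalizer:
    """Hypothetical helper (not part of Maze): running mean/std via Welford's algorithm."""

    def __init__(self, eps: float = 1e-8):
        self.count = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean
        self.eps = eps  # avoids division by zero for constant observations

    def update(self, value: float) -> None:
        """Fold one new observation into the running statistics."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (value - self.mean)

    @property
    def std(self) -> float:
        """Population standard deviation; 1.0 until two samples were seen."""
        if self.count < 2:
            return 1.0
        return math.sqrt(self.m2 / self.count)

    def normalize(self, value: float) -> float:
        """Shift and scale an observation to roughly zero mean, unit variance."""
        return (value - self.mean) / (self.std + self.eps)


normalizer = RunningNormalizer()
for obs in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    normalizer.update(obs)

normalized = normalizer.normalize(9.0)
```

In a real project the statistics would typically be estimated per observation component and kept fixed during evaluation; the sketch only shows the arithmetic.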

Get Started

Make sure PyTorch is installed and then get the latest released version of Maze as follows:
```
pip install -U maze-rl

# optionally install RLlib if you want to use it in combination with Maze
# (the quotes keep shells such as zsh from interpreting the square brackets)
pip install "ray[rllib]" tensorflow
```
Read more about other options like the installation of the latest development version.

⚡ We encourage you to start with Python 3.7, as many popular environments like Atari or Box2D cannot easily be installed in newer Python environments. Maze itself supports newer Python versions, but for Python 3.9 you might have to install additional binary dependencies manually.
To see Maze in action, check out a first example.
For a more applied introduction, visit the step-by-step tutorial.


Documentation

Learn more about Maze

The documentation is the starting point to learn more about the underlying concepts, but most importantly also provides code snippets and minimum working examples to get you started quickly.

- The Workflow section guides you through typical tasks in an RL project.
- Policy and Value Networks introduces you to the Perception Module, shows how to customize action spaces and the underlying action probability distributions, and covers two styles of policy and value network construction:
  - Template models are composed directly from an environment's observation and action space, allowing you to train with suitable agent networks on a new environment within minutes.
  - Custom models give you the full flexibility of application-specific models, either with the provided Maze building blocks or directly with PyTorch.
- Learn more about core concepts and structures such as the Maze environment hierarchy and the Maze event system, which provides a convenient way to collect statistics and KPIs, enables flexible reward formulation, and supports offline analysis.
- Structured Environments and Action Masking introduces you to a general concept that can greatly improve the performance of trained agents in practical RL problems.
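
As an illustration of the action masking idea mentioned above, the following plain-Python sketch restricts a softmax over action logits to the currently valid actions. It is not Maze's implementation: the function name `masked_softmax` and the list-based interface are assumptions made for this example only.

```python
import math
from typing import List


def masked_softmax(logits: List[float], mask: List[bool]) -> List[float]:
    """Softmax over valid actions only; masked-out actions get probability 0."""
    if len(logits) != len(mask):
        raise ValueError("logits and mask must have the same length")
    if not any(mask):
        raise ValueError("at least one action must be valid")

    # Subtract the largest valid logit for numerical stability.
    max_logit = max(l for l, m in zip(logits, mask) if m)
    exps = [math.exp(l - max_logit) if m else 0.0 for l, m in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


# Action 1 is invalid in the current state, so it can never be sampled.
probs = masked_softmax([1.0, 2.0, 3.0], [True, False, True])
```

Because invalid actions receive exactly zero probability, the agent does not have to learn from rewards alone which actions are infeasible.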

License

Maze is freely available for research and non-commercial use. A commercial license is available. If you are interested, please contact us via our company website or write us an email.

We believe in Open Source principles and aim at transitioning Maze to a commercial Open Source project, releasing larger parts of the framework under a permissive license in the near future.

Comments

Configuration problems in the step-by-step tutorial
I've just been trying out maze and tried out the step-by-step tutorial.

In Step 5 (5. Training the MazeEnv) the instructions are incomplete or wrong.

I was able to get it running in the end, but it took (us) quite some time. I'm not sure if this is a bug in maze or hydra, or if just some newer version of either library changes the behavior a little bit. But you should update the documentation such that it works out of the box for new users of the library.

The setup (under Ubuntu 20.04):

    >> mkdir maze5 && cd maze5
    >> pyenv local 3.8.8
    >> python -m venv .venv
    >> source .venv/bin/activate
    >> pip install maze-rl torch
    >> pip list
    Package                 Version
    ----------------------- -----------
    hydra-core              1.1.0
    hydra-nevergrad-sweeper 1.1.5
    maze-rl                 0.1.7
    torch                   1.9.0
    ...

Then I just copy-pasted the files from the https://github.com/enlite-ai/maze-examples/tree/main/tutorial_maze_env/part03_maze_env repo and adjusted the _target_ paths in the config yamls (e.g. from _target_: tutorial_maze_env.part03_maze_env.env.maze_env.maze_env_factory to _target_: env.maze_env.maze_env_factory).

Problem 1:

When you run the suggested training command, Hydra will just complain that it can't find the configuration files.

    >> maze-run -cn conf_train env=tutorial_cutting_2d_basic wrappers=tutorial_cutting_2d_basic \
         model=tutorial_cutting_2d_basic algorithm=ppo
    In 'conf_train': Could not find 'model/tutorial_cutting_2d_basic'

    Available options in 'model':
        flatten_concat
        flatten_concat_shared_embedding
        pixel_obs
        pixel_obs_rnn
        rllib
        vector_obs
        vector_obs_rnn
    Config search path:
        provider=hydra, path=pkg://hydra.conf
        provider=main, path=pkg://maze.conf
        provider=schema, path=structured://

Fix:

You can just define the config directory for hydra with maze-run -cd conf -cn conf_train .... Then Hydra will find the 3 config files and load them correctly.

Problem 2:

After loading the config files, Hydra tries to load the modules defined in the _target_ fields. And that fails immediately with:

    ...
      File "***/maze5-uWAZh5bh/lib/python3.8/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 104, in _resolve_target
        return _locate(target)
      File "***/maze5-uWAZh5bh/lib/python3.8/site-packages/hydra/_internal/utils.py", line 563, in _locate
        raise ImportError(f"Error loading module '{path}'") from e
    ImportError: Error loading module 'env.maze_env.maze_env_factory'

Fix:

For some reason Hydra doesn't know the path to the directory from where we call maze-run. And therefore it doesn't find the env directory containing the maze_env file.

This is fixable by just setting the environment variable: export PYTHONPATH="$PYTHONPATH:$PWD/".
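
The reason this fix works can be sketched in a few lines of Python: entries of PYTHONPATH end up on `sys.path`, and a dotted target can only be imported if its top-level package lives in one of those directories. The snippet below is a simplified stand-in, not Hydra's actual `_locate`; the helper `locate`, the package name `demo_env` and the temporary project directory are hypothetical and exist only for this demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def locate(dotted_path: str):
    """Simplified stand-in: import 'pkg.module' and return the named attribute."""
    module_path, _, attr = dotted_path.rpartition(".")
    module = importlib.import_module(module_path)
    return getattr(module, attr)


# Build a throwaway project directory with a hypothetical package layout.
project_dir = Path(tempfile.mkdtemp())
package_dir = project_dir / "demo_env"
package_dir.mkdir()
(package_dir / "__init__.py").write_text("")
(package_dir / "maze_env.py").write_text(
    "def maze_env_factory():\n    return 'env created'\n"
)

target = "demo_env.maze_env.maze_env_factory"

# Without the project directory on sys.path the import fails.
try:
    locate(target)
    found_before = True
except ImportError:
    found_before = False

# Equivalent in effect to: export PYTHONPATH="$PYTHONPATH:$PWD/"
sys.path.append(str(project_dir))
importlib.invalidate_caches()

factory = locate(target)
result = factory()
```

Installing the project as a package (for example in editable mode) is another common way to make such targets importable without touching PYTHONPATH.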
bug documentation
opened by jakobkogler 2
Hello from Hydra :)

Thanks for using Hydra! I see that you are using Hydra 1.1 already which is great. One thing that is really recent is the ability to configure the config searchpath from the primary config. You can learn about it here.

This can probably eliminate the need of your users to even know what a ConfigSearchpathPlugin is.

Feel free to jump into the Hydra chat if you have any questions.

opened by omry 2
Version 0.1.7
Adds Soft Actor-Critic (SAC) Trainer (supporting Dictionary Observations and Actions)

Simplifies the reward aggregation interface (now also supports multi-agent training)

Extends PPO and A2C to multi-agent capable actor-critic trainers (individual agents vs. centralized critic)

Adds option for custom rollout evaluators

Adds option for shared weights in actor-critic settings

Adds experiment and multi-run support for RunContext Python API
opened by enliteai 0
Version 0.1.6
Changes

made Maze compatible with RLlib 1.4

updated to the recently released Hydra 1.1.0

Simplified API (RunContext): Experiment and evaluation support

Fixed support of the nevergrad sweeper: made the LocalLauncher hydra plugin part of the wheel

Replaced the (policy id, actor id) tuple with an ActorID class

Other

various documentation improvements

added ready-to-go Docker containers

contribution guidelines, pull request templates etc. on GitHub
opened by md-enlite 0
Version 0.1.5
Features:

Adds documentation for run_context

Changes of simulated environment interfaces step_without_observation -> fast_step

Adds seeding to environments, models and trainers

Initial commit of the Maze Python API

Adds an ExportGifWrapper

Adds network architecture visualizations to Tensorboard Images

adds incremental min/max stats

adds categorical (support-based) value networks

added value transformations
opened by md-enlite 0
Towards Version 0.1.5
Adds seeding to environments, models and trainers

Initial commit of the Maze Python API

Adds an ExportGifWrapper

Adds network architecture visualizations to Tensorboard Images
opened by md-enlite 0
Release Version 0.1.4
improved docs

switch to RLlib version 1.3.0.

full structured env support

policy interface now selects policy based on actor_id

added testing dependencies to main package
opened by enliteai 0
Dev
adds PointNetFeatureBlock to perception module

adds Tensorboard hyperparameter visualization for Hydra multiruns

merges parallel and sequential dataset into a single InMemoryDataset
opened by md-enlite 0
Version 0.1.3
Improvements:

Enable event collection from within the Wrapper stack

Aligned StepSkipWrapper with the event system

MonitoringWrapper: Logging of observations, actions and rewards throughout the wrapper stack, useful for diagnosis

Make _recursive_ in Hydra config files compatible with Maze object instantiation
opened by enliteai 0
Version 0.1.2
Features:

Imitation Learning:

Added Evaluation Rollouts

Unified dataset structures (InMemoryDataset)

GlobalPoolingBlock: now supports sum and max pooling

ObservationNormalizationWrapper: Adds observation and observation distribution visualization to Tensorboard logging.

Distribution: Introduced VectorEnv, refactored the single and multi process parallelization wrappers.
opened by enliteai 0
Dev
Features:

hyper parameter optimization via grid search and Nevergrad

plain python training example

local hydra job launcher

extend attention/transformer perception blocks

Fixes:

cumulative stats logging
opened by md-enlite 0

Releases(v0.2.0)

v0.2.0(Nov 21, 2022)
New graph neural network building blocks (message passing based on torch-scatter in addition to existing graph convolutions)

Support for action recording, replay from pre-computed action records and feature collection.

Improved wrapper hierarchy semantics: Previously values were assigned to the outermost wrapper. Now values are assigned to existing attributes by traversing the wrapper hierarchy.

Removal of deprecated modules (APIContext and Maze models for RLlib)

Reflecting changes in upstream dependencies (Gym version pinned to <0.23)
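
The changed wrapper semantics noted above can be illustrated with a small, self-contained sketch. It shows one possible reading of that release note and is not taken from the Maze code base: the classes `Env` and `Wrapper` and the attribute names are hypothetical.

```python
class Env:
    """Innermost object of a hypothetical wrapper stack."""

    def __init__(self):
        self.difficulty = 1


class Wrapper:
    """Hypothetical wrapper: writes go to the layer that already owns the attribute."""

    def __init__(self, env):
        # Bypass the custom __setattr__ while wiring up the stack.
        object.__setattr__(self, "env", env)

    def __getattr__(self, name):
        # Only called when normal lookup fails: delegate to the wrapped object.
        if name == "env":
            raise AttributeError(name)
        return getattr(self.env, name)

    def __setattr__(self, name, value):
        layer = self
        while True:
            if name in vars(layer):
                # An existing attribute was found: update it in place.
                object.__setattr__(layer, name, value)
                return
            if not isinstance(layer, Wrapper):
                break
            layer = vars(layer)["env"]
        # No layer owns the attribute yet: create it on the outermost wrapper.
        object.__setattr__(self, name, value)


env = Env()
stack = Wrapper(Wrapper(env))

stack.difficulty = 5   # updates the attribute that already exists on env
stack.new_flag = True  # unknown attribute: stored on the outermost wrapper
```

With assignment to the outermost wrapper only, `env.difficulty` would have stayed at 1 and the wrapper would have shadowed it with its own copy.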

v0.1.8(Dec 13, 2021)
New Features

Agent Deployment Workflow

Soft Actor Critic from Demonstrations (SACfD)

Locally Distributed ES Runner

SpacesRecordingWrapper: Records and dumps processed trajectories to pickle files

Fixes event logging for environment resets and policy events

v0.1.7(Jun 24, 2021)
Adds Soft Actor-Critic (SAC) Trainer (supporting Dictionary Observations and Actions)

Simplifies the reward aggregation interface (now also supports multi-agent training)

Extends PPO and A2C to multi-agent capable actor-critic trainers (individual agents vs. centralized critic)

Adds option for custom rollout evaluators

Adds option for shared weights in actor-critic settings

Adds experiment and multi-run support for RunContext Python API

Compatibility with PyTorch 1.9

v0.1.6(Jun 14, 2021)
Changes

made Maze compatible with RLlib 1.4

updated to the recently released Hydra 1.1.0

Simplified API (RunContext): Experiment and evaluation support

Fixed support of the nevergrad sweeper: made the LocalLauncher hydra plugin part of the wheel

Replaced the (policy id, actor id) tuple with an ActorID class

Other

various documentation improvements

added ready-to-go Docker containers

contribution guidelines, pull request templates etc. on GitHub

v0.1.5(May 20, 2021)
Features:

adds RunContext (Maze Python API)

adds seeding to environments, models and trainers

changes of simulated environment interfaces step_without_observation -> fast_step

Improvements:

adds an ExportGifWrapper

adds network architecture visualizations to Tensorboard Images

adds incremental min/max stats

adds categorical (support-based) value networks

adds value transformations

v0.1.4(Apr 29, 2021)
switch to RLlib version 1.3.0.

full structured env support

policy interface now selects policy based on actor_id

interfaces support collaborative multi-agent actor critic

improved docs

added testing dependencies to main package

v0.1.3(Apr 1, 2021)
Improvements:

Enable event collection from within the Wrapper stack

Aligned StepSkipWrapper with the event system

MonitoringWrapper: Logging of observations, actions and rewards throughout the wrapper stack, useful for diagnosis

Make _recursive_ in Hydra config files compatible with Maze object instantiation

v0.1.2(Mar 25, 2021)
Features:

Imitation Learning:

Added Evaluation Rollouts

Unified dataset structures (InMemoryDataset)

GlobalPoolingBlock: now supports sum and max pooling

ObservationNormalizationWrapper: Adds observation and observation distribution visualization to Tensorboard logging.

Distribution: Introduced VectorEnv, refactored the single and multi process parallelization wrappers.

v0.1.1(Mar 18, 2021)
Features:

hyper parameter optimization via grid search and Nevergrad

plain python training example

local hydra job launcher

extend attention/transformer perception blocks

adds MazeEnvMonitoringWrapper as a default to wrapper stacks

Fixes:

cumulative stats logging

v0.1.0(Mar 11, 2021)
Documentation updates:

Integrating existing Gym environments

Factory documentation

Experiments workflow, ...

Updated to Hydra 1.1.0:

Using Hydra.instantiate instead of custom registry implementation

Added Rollout evaluator

Owner

EnliteAI GmbH

enliteAI is a machine learning company, developing the Reinforcement Learning framework Maze.

GitHub Repository https://maze-rl.readthedocs.io/
