ShinRL: A Library for Evaluating RL Algorithms from Theoretical and Practical Perspectives

Last update: Dec 28, 2022

Status: Under development (expect bug fixes and huge updates)

ShinRL is an open-source JAX library specialized for the evaluation of reinforcement learning (RL) algorithms from both theoretical and practical perspectives. Please take a look at the paper for details.

QuickStart

Try ShinRL at: experiments/QuickStart.ipynb.

import gym
from shinrl import DiscreteViSolver
import matplotlib.pyplot as plt

# make an env & a config
env = gym.make("ShinPendulum-v0")
config = DiscreteViSolver.DefaultConfig(explore="eps_greedy", approx="nn", steps_per_epoch=10000)

# make mixins
mixins = DiscreteViSolver.make_mixins(env, config)
# mixins == [DeepRlStepMixIn, QTargetMixIn, TbInitMixIn, NetActMixIn, NetInitMixIn, ShinExploreMixIn, ShinEvalMixIn, DiscreteViSolver]

# (optional) arrange mixins
# mixins.insert(2, UserDefinedMixIn)

# make & run a solver
dqn_solver = DiscreteViSolver.factory(env, config, mixins)
dqn_solver.run()

# plot performance
returns = dqn_solver.scalars["Return"]
plt.plot(returns["x"], returns["y"])

# plot learned q-values  (act == 0)
q0 = dqn_solver.tb_dict["Q"][:, 0]
env.plot_S(q0, title="Learned")

# plot oracle q-values  (act == 0)
q0 = env.calc_q(dqn_solver.tb_dict["ExploitPolicy"])[:, 0]
env.plot_S(q0, title="Oracle")

# plot optimal q-values  (act == 0)
q0 = env.calc_optimal_q()[:, 0]
env.plot_S(q0, title="Optimal")

⚡ Key Modules

ShinRL consists of two main modules:

- ShinEnv: Implements relatively small MDP environments with access to oracle quantities.
- Solver: Solves the environments (e.g., finds the optimal policy) with specified algorithms.

🔬 ShinEnv for Oracle Analysis

ShinEnv provides small environments with oracle methods that can compute exact quantities:
- calc_q computes a Q-value table containing all possible state-action pairs given a policy.
- calc_optimal_q computes the optimal Q-value table.
- calc_visit computes the state visitation frequency table for a given policy.
- calc_return is a shortcut for computing exact undiscounted returns for a given policy.
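
To make the first two oracle quantities concrete, here is a minimal pure-Python sketch on a hypothetical two-state, two-action MDP. It is not ShinRL code: the names toy_calc_q and toy_calc_optimal_q, the discount factor, and the transition and reward tables are all assumptions chosen for illustration, and ShinRL's own methods may differ in signature and in how they treat horizon and discounting.

```python
# Hypothetical toy MDP (not a ShinRL environment): 2 states, 2 actions.
# Action 0 stays in the current state, action 1 moves to the other state.
GAMMA = 0.5  # assumed discount factor

# P[s][a][s2]: probability of reaching s2 after taking action a in state s.
P = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
# R[s][a]: reward of 1 only for staying in state 1.
R = [[0.0, 0.0], [1.0, 0.0]]


def backup(q, state_value):
    """Apply one Bellman backup; state_value(s, q[s]) turns a Q row into V(s)."""
    v = [state_value(s, q[s]) for s in range(len(q))]
    return [
        [R[s][a] + GAMMA * sum(p * v[s2] for s2, p in enumerate(P[s][a]))
         for a in range(len(q[s]))]
        for s in range(len(q))
    ]


def toy_calc_q(policy, sweeps=200):
    """Q-table of a stochastic policy, where policy[s][a] = pi(a | s)."""
    q = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(sweeps):
        # V(s) is the policy-weighted average of the Q row.
        q = backup(q, lambda s, row: sum(pi * x for pi, x in zip(policy[s], row)))
    return q


def toy_calc_optimal_q(sweeps=200):
    """Optimal Q-table via value iteration: V(s) is the max of the Q row."""
    q = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(sweeps):
        q = backup(q, lambda s, row: max(row))
    return q


uniform = [[0.5, 0.5], [0.5, 0.5]]
q_pi = toy_calc_q(uniform)
q_star = toy_calc_optimal_q()
print("Q under the uniform policy:", q_pi)
print("Optimal Q:", q_star)
```

In this toy problem, staying in state 1 forever collects 1 + 0.5 + 0.25 + ..., so the optimal Q-value for (state 1, stay) converges to 2, and every entry of the optimal table is at least as large as the corresponding entry for the uniform policy.
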

Some environments support continuous action spaces and image observations. See the following table and shinrl/envs/__init__.py for the available environments.

Environment	Discrete action	Continuous action	Image observation	Tuple observation
ShinMaze	✔️	❌	❌	✔️
ShinMountainCar-v0	✔️	✔️	✔️	✔️
ShinPendulum-v0	✔️	✔️	✔️	✔️
ShinCartPole-v0	✔️	✔️	❌	✔️
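
The visitation and return quantities listed for calc_visit and calc_return can be illustrated in the same spirit. The sketch below is a hypothetical finite-horizon example in pure Python, not ShinRL's implementation: the names toy_calc_visit and toy_calc_return, the horizon of 3, the fixed start state, and the choice to keep one visitation row per time step are assumptions.

```python
# Hypothetical toy MDP (not a ShinRL environment): 2 states, 2 actions.
HORIZON = 3          # assumed episode length
INIT = [1.0, 0.0]    # assumed start distribution: always begin in state 0

# P[s][a][s2]: action 0 stays in place, action 1 moves to the other state.
P = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
# R[s][a]: reward of 1 only for staying in state 1.
R = [[0.0, 0.0], [1.0, 0.0]]


def toy_calc_visit(policy):
    """Return visit[t][s], the probability of being in state s at step t."""
    visit = [INIT[:]]
    for _ in range(HORIZON - 1):
        current = visit[-1]
        nxt = [0.0] * len(current)
        # Push probability mass forward through the policy and the dynamics.
        for s, d_s in enumerate(current):
            for a, pi_a in enumerate(policy[s]):
                for s2, p in enumerate(P[s][a]):
                    nxt[s2] += d_s * pi_a * p
        visit.append(nxt)
    return visit


def toy_calc_return(policy):
    """Expected undiscounted return: sum over t, s, a of d_t(s) * pi(a|s) * R[s][a]."""
    return sum(
        d_s * pi_a * R[s][a]
        for d in toy_calc_visit(policy)
        for s, d_s in enumerate(d)
        for a, pi_a in enumerate(policy[s])
    )


uniform = [[0.5, 0.5], [0.5, 0.5]]
move_then_stay = [[0.0, 1.0], [1.0, 0.0]]
print(toy_calc_visit(uniform), toy_calc_return(uniform))
print(toy_calc_visit(move_then_stay), toy_calc_return(move_then_stay))
```

Because the return is computed directly from the visitation table, the two quantities stay consistent by construction, which is one reason a return helper can be described as a shortcut.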

🏭 Flexible Solver by MixIn

A "mixin" is a class that defines and implements a single feature. ShinRL instantiates a solver by combining several mixins.
By rearranging mixins, you can easily implement your own ideas on top of ShinRL's code base. See experiments/QuickStart.ipynb for an example.
The following code demonstrates how different sets of mixins turn the same solver into "value iteration" or "deep Q-learning":

import gym
from shinrl import DiscreteViSolver

env = gym.make("ShinPendulum-v0")

# run value iteration (dynamic programming)
config = DiscreteViSolver.DefaultConfig(approx="tabular", explore="oracle")
mixins = DiscreteViSolver.make_mixins(env, config)
# mixins == [TabularDpStepMixIn, QTargetMixIn, TbInitMixIn, ShinExploreMixIn, ShinEvalMixIn, DiscreteViSolver]
vi_solver = DiscreteViSolver.factory(env, config, mixins)
vi_solver.run()

# run deep Q learning 
config = DiscreteViSolver.DefaultConfig(approx="nn", explore="eps_greedy")
mixins = DiscreteViSolver.make_mixins(env, config)  
# mixins == [DeepRlStepMixIn, QTargetMixIn, TbInitMixIn, NetActMixIn, NetInitMixIn, ShinExploreMixIn, ShinEvalMixIn, DiscreteViSolver]
dql_solver = DiscreteViSolver.factory(env, config, mixins)
dql_solver.run()

# ShinRL also provides deep RL solvers that support OpenAI Gym environments.
env = gym.make("CartPole-v0")
mixins = DiscreteViSolver.make_mixins(env, config)  
# mixins == [DeepRlStepMixIn, QTargetMixIn, TargetMixIn, NetActMixIn, NetInitMixIn, GymExploreMixIn, GymEvalMixIn, DiscreteViSolver]
dql_solver = DiscreteViSolver.factory(env, config, mixins)
dql_solver.run()
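
The general idea of building a solver from an ordered list of mixins can be sketched with Python's multiple inheritance. This is an illustration only: every class and function below is hypothetical, and it is an assumption that composing the list into a single class in this way resembles what DiscreteViSolver.factory does internally.

```python
class TabularStepMixIn:
    """Hypothetical mixin: one way to implement a training step."""
    def step(self):
        return "tabular dynamic-programming step"


class DeepStepMixIn:
    """Hypothetical mixin: an alternative implementation of the same feature."""
    def step(self):
        return "gradient step on a q-network"


class EvalMixIn:
    """Hypothetical mixin: evaluation that relies on whichever step() is mixed in."""
    def evaluate(self):
        return "evaluated after a " + self.step()


class BaseSolver:
    """Hypothetical base class that owns the run loop."""
    def __init__(self, name):
        self.name = name

    def run(self):
        return f"{self.name}: {self.evaluate()}"


def make_mixins(approx):
    """Pick the mixin list from a config value, base class last."""
    step = TabularStepMixIn if approx == "tabular" else DeepStepMixIn
    return [step, EvalMixIn, BaseSolver]


def factory(name, mixins):
    """Create a class whose bases are the mixins; earlier entries take precedence."""
    solver_cls = type("MixedSolver", tuple(mixins), {})
    return solver_cls(name)


class UserDefinedMixIn:
    """Hypothetical user mixin that wraps the next step() in the resolution order."""
    def step(self):
        return super().step().upper()


vi = factory("vi", make_mixins("tabular"))
print(vi.run())

mixins = make_mixins("nn")
mixins.insert(0, UserDefinedMixIn)  # placed first so it overrides step()
custom = factory("custom", mixins)
print(custom.run())
```

The position in the list matters because Python resolves methods from left to right: a mixin inserted earlier can override or wrap a method provided by a later one, which is why rearranging the list is enough to change a solver's behavior.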

Installation

git clone git@github.com:omron-sinicx/ShinRL.git
cd ShinRL
pip install -e .

Test

cd ShinRL
make test

Format

cd ShinRL
make format

Docker

cd ShinRL
docker-compose up

Citation

# NeurIPS Deep RL Workshop 2021 version
@inproceedings{toshinori2021shinrl,
    author = {Kitamura, Toshinori and Yonetani, Ryo},
    title = {ShinRL: A Library for Evaluating RL Algorithms from Theoretical and Practical Perspectives},
    year = {2021},
    booktitle = {Proceedings of the NeurIPS Deep RL Workshop},
}

# arXiv version
@article{toshinori2021shinrlArxiv,
    author = {Kitamura, Toshinori and Yonetani, Ryo},
    title = {ShinRL: A Library for Evaluating RL Algorithms from Theoretical and Practical Perspectives},
    year = {2021},
    url = {https://arxiv.org/abs/2112.04123},
    journal={arXiv preprint arXiv:2112.04123},
}
