Learning Off-Policy with Online Planning, CoRL 2021

Last update: Nov 22, 2022

Overview

LOOP: Learning Off-Policy with Online Planning

Accepted at the Conference on Robot Learning (CoRL) 2021.

Harshit Sikchi, Wenxuan Zhou, David Held

Paper

Install

PyTorch 1.5
OpenAI Gym
MuJoCo
tqdm
D4RL dataset
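
Before training, it can help to confirm that the listed packages are importable. The following is a minimal sketch, not part of the repository. The import names (in particular `mujoco_py` for MuJoCo and `d4rl` for the D4RL dataset) are assumptions, since the list above names packages rather than modules, and `find_missing` is a hypothetical helper.

```python
import importlib.util

# Display name -> import name. The import names are assumptions:
# the install list gives package names, not module names.
REQUIRED_MODULES = {
    "PyTorch": "torch",
    "OpenAI Gym": "gym",
    "MuJoCo": "mujoco_py",
    "tqdm": "tqdm",
    "D4RL": "d4rl",
}


def find_missing(modules=REQUIRED_MODULES):
    """Return display names of packages whose module cannot be located."""
    missing = []
    for name, module in modules.items():
        # find_spec locates a top-level module without importing it.
        if importlib.util.find_spec(module) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing()
    print("Missing:", ", ".join(missing) if missing else "none")
```

This only checks that a module can be found, not that its version matches (for example PyTorch 1.5).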

File Structure

LOOP (Core method)
- Training code (Online RL): train_loop_sac.py
- Training code (Offline RL): train_loop_offline.py
- Training code (Safe RL): train_loop_safety.py
- Policies (online/offline/safety): policies.py
- ARC/H-step lookahead policy: controllers/
Environments: envs/
Configurations: configs/

Instructions

Run all experiments from the root folder.
Config files in configs/ specify the hyperparameters for the controllers and dynamics.
To reproduce the results in our paper, keep all other values in the yml files consistent with the hyperparameters given in the paper.
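
One way to check that a modified config still matches a reference copy is to diff the two after loading them into dictionaries. The sketch below is an illustration only: `diff_config` is a hypothetical helper, and the keys in the example (`controller`, `horizon`, `num_samples`, `dynamics`, `ensemble_size`) are made up rather than taken from the files in configs/. Loading the yml files themselves would need a YAML parser, which is omitted here.

```python
MISSING = "<missing>"  # marker for a key present on only one side


def diff_config(reference, candidate, prefix=""):
    """Yield (dotted_key, reference_value, candidate_value) for each mismatch.

    Both arguments are nested dicts with string keys, e.g. parsed yml files.
    """
    for key in sorted(set(reference) | set(candidate)):
        path = f"{prefix}{key}"
        ref_val = reference.get(key, MISSING)
        cand_val = candidate.get(key, MISSING)
        if isinstance(ref_val, dict) and isinstance(cand_val, dict):
            # Recurse into nested sections and extend the dotted path.
            yield from diff_config(ref_val, cand_val, prefix=f"{path}.")
        elif ref_val != cand_val:
            yield path, ref_val, cand_val


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    reference = {"controller": {"horizon": 3, "num_samples": 100},
                 "dynamics": {"ensemble_size": 5}}
    candidate = {"controller": {"horizon": 5, "num_samples": 100},
                 "dynamics": {"ensemble_size": 5}}
    for path, ref_val, cand_val in diff_config(reference, candidate):
        print(f"{path}: reference={ref_val} candidate={cand_val}")
```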

Experiments

Sec 6.1 LOOP for Online RL

python train_loop_sac.py --env=<env_name> --policy=LOOP_SAC_ARC --start_timesteps=<initial exploration steps> --exp_name=<location_to_logs>

Environment wrappers, along with their termination conditions, can be found under envs/

Sec 6.2 LOOP for Offline RL

Download the CRR-trained models from Link into the root folder.

python train_loop_offline.py --env=<env_name> --policy=LOOP_OFFLINE_ARC --exp_name=<location_to_logs> --offline_algo=CRR --prior_type=CRR

Currently supported for D4RL MuJoCo locomotion tasks only.

Sec 6.3 LOOP for Safe RL

python train_loop_safety.py --env=<env_name> --policy=safeLOOP_ARC --exp_name=<location_to_logs>

Safety environments can be found under envs/safety_envs.py
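
The three training commands in this section share the same shape, so they can be assembled from one table. This is a minimal sketch: the script names, policy names, and flags are copied from the commands above, while `build_command`, the `SETTINGS` keys, and the example environment name are hypothetical and not part of the repository.

```python
import shlex

# Setting -> (training script, --policy value), copied from the commands above.
SETTINGS = {
    "online": ("train_loop_sac.py", "LOOP_SAC_ARC"),
    "offline": ("train_loop_offline.py", "LOOP_OFFLINE_ARC"),
    "safe": ("train_loop_safety.py", "safeLOOP_ARC"),
}


def build_command(setting, env_name, exp_name, start_timesteps=None):
    """Return the argv list for one training run (hypothetical helper)."""
    script, policy = SETTINGS[setting]
    cmd = ["python", script, f"--env={env_name}", f"--policy={policy}"]
    # --start_timesteps appears only in the online RL command.
    if setting == "online" and start_timesteps is not None:
        cmd.append(f"--start_timesteps={start_timesteps}")
    cmd.append(f"--exp_name={exp_name}")
    # The offline RL command additionally selects the CRR prior.
    if setting == "offline":
        cmd += ["--offline_algo=CRR", "--prior_type=CRR"]
    return cmd


if __name__ == "__main__":
    # "HalfCheetah-v2" and "logs/run0" are placeholder values, not from the README.
    print(shlex.join(build_command("offline", "HalfCheetah-v2", "logs/run0")))
```

The returned list can be passed to `subprocess.run` from the root folder; `shlex.join` (Python 3.8+) is used only to print a copy-pasteable line.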

References

Parts of the code are adapted from the references below:

@article{SpinningUp2018,
    author = {Achiam, Joshua},
    title = {{Spinning Up in Deep Reinforcement Learning}},
    year = {2018}
}

https://github.com/Xingyu-Lin/mbpo_pytorch

Comments

Environment reproducibility

Hi, I am trying to run your code. However, I am trying to get packages prepared on newest version and have been encountering errors such as with mpi4py which does not install correctly in my environment.

Is it possible for you guys to provide a requirements.txt file for me to generate the python virtual environment that will set up the dependencies to run the code? Otherwise a container image such as docker will also be great!

opened by pranjaldhole 0

Releases(v0.0.0)

v0.0.0(Aug 27, 2022)

Owner

Harshit Sikchi
