A PyTorch replication of the model-based reinforcement learning algorithm MBPO

Last update: Jan 05, 2023

Overview

This is a PyTorch re-implementation of the model-based RL algorithm MBPO, as described in the paper "When to Trust Your Model: Model-Based Policy Optimization".

This code builds on an earlier NeurIPS reproducibility challenge paper, which reproduced the results with a TensorFlow ensemble model but reported a significant drop in performance with a PyTorch ensemble model. This code re-implements the ensemble dynamics model in PyTorch and closes that gap.
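As a rough illustration of what an ensemble dynamics model in PyTorch can look like, here is a minimal sketch. It is not taken from this repository: the names `EnsembleLinear`, `EnsembleDynamics`, and `gaussian_nll`, the layer sizes, and the SiLU activation are all assumptions made for the example. Each ensemble member predicts a Gaussian (mean and log-variance) over the reward and the change in state, and all members are evaluated in one batched matrix multiply.

```python
import torch
import torch.nn as nn


class EnsembleLinear(nn.Module):
    """Linear layer that keeps one weight matrix per ensemble member (hypothetical helper)."""

    def __init__(self, ensemble_size, in_dim, out_dim):
        super().__init__()
        # Scaled random init; the scheme is an assumption, not the repository's choice.
        self.weight = nn.Parameter(
            torch.randn(ensemble_size, in_dim, out_dim) * in_dim ** -0.5
        )
        self.bias = nn.Parameter(torch.zeros(ensemble_size, 1, out_dim))

    def forward(self, x):
        # x: (ensemble_size, batch, in_dim) -> (ensemble_size, batch, out_dim)
        return torch.bmm(x, self.weight) + self.bias


class EnsembleDynamics(nn.Module):
    """Each member outputs mean and log-variance for [reward, state delta]."""

    def __init__(self, state_dim, action_dim, ensemble_size=7, hidden=200):
        super().__init__()
        self.ensemble_size = ensemble_size
        out_dim = state_dim + 1  # one reward plus one delta per state dimension
        self.net = nn.Sequential(
            EnsembleLinear(ensemble_size, state_dim + action_dim, hidden),
            nn.SiLU(),
            EnsembleLinear(ensemble_size, hidden, hidden),
            nn.SiLU(),
            EnsembleLinear(ensemble_size, hidden, 2 * out_dim),
        )

    def forward(self, state, action):
        x = torch.cat([state, action], dim=-1)               # (batch, s + a)
        x = x.unsqueeze(0).repeat(self.ensemble_size, 1, 1)  # same input for every member
        mean, logvar = self.net(x).chunk(2, dim=-1)
        return mean, logvar


def gaussian_nll(mean, logvar, target):
    """Gaussian negative log-likelihood, up to an additive constant."""
    inv_var = torch.exp(-logvar)
    return (((mean - target) ** 2) * inv_var + logvar).mean()
```

In this sketch every member sees the same batch. A bootstrapped ensemble would instead train each member on its own resampled batch, and a model rollout would typically pick one member at random per step.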

Reproduced results

The comparison was done on two tasks; other tasks have not been tested. On those two tasks, the PyTorch implementation achieves performance similar to the official TensorFlow code.

Dependencies

MuJoCo 1.5 & MuJoCo 2.0

Usage

python main_mbpo.py --env_name 'Walker2d-v2' --num_epoch 300 --model_type 'pytorch'

python main_mbpo.py --env_name 'Hopper-v2' --num_epoch 300 --model_type 'pytorch'

Reference

Official TensorFlow implementation: https://github.com/JannerM/mbpo
Code for the reproducibility challenge paper: https://github.com/jxu43/replication-mbpo

Owner

Xingyu Lin
