Offline Multi-Agent Reinforcement Learning Implementations: Solving Overcooked Game with Data-Driven Method

Last update: Sep 16, 2022

Overview

Overcooked-AI

We aim to apply conventional offline reinforcement learning techniques to multi-agent algorithms.
In this repository, we implement behavior cloning (BC), offline MADDPG, MADDPG+REM (MADDPG w/ REM), and MADDPG+BCQ (MADDPG w/ BCQ) in PyTorch. BCQ is currently a work in progress and is not fully implemented.
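To make the REM variant more concrete, here is a minimal sketch of the core idea behind Random Ensemble Mixture: several Q-value heads are combined with random convex weights that are resampled for each update. It uses only the Python standard library; the function names and the three example head values are hypothetical, and how this repository wires the mixture into the MADDPG critic is not shown in this README.

```python
import random


def sample_mixture_weights(num_heads, rng):
    """Draw random non-negative weights that sum to 1 (a point on the simplex)."""
    raw = [rng.random() for _ in range(num_heads)]
    total = sum(raw)
    return [w / total for w in raw]


def mix_q_values(head_q_values, weights):
    """Return the convex combination of the per-head Q-value estimates."""
    return sum(w * q for w, q in zip(weights, head_q_values))


rng = random.Random(0)
head_q_values = [1.0, 2.0, 4.0]  # hypothetical outputs of three critic heads
weights = sample_mixture_weights(len(head_q_values), rng)
mixed_q = mix_q_values(head_q_values, weights)
```

Because the weights are non-negative and sum to 1, the mixed estimate always lies between the smallest and largest head value.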

We collected a 0.5M multi-agent offline RL dataset and ran experiments with each of the compared methods. The data was collected with online MADDPG agents and includes exploration trajectories generated with OU (Ornstein-Uhlenbeck) noise. The experiments were run on the Asymmetric Advantages layout of the Overcooked environment.
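For readers unfamiliar with OU noise, the following is a minimal sketch of a discretized Ornstein-Uhlenbeck process of the kind commonly added to actions for exploration. The class name `OUNoise` and the default values of `theta`, `sigma`, and `dt` are assumptions made for illustration; they are not taken from this repository.

```python
import math
import random


class OUNoise:
    """Temporally correlated noise that drifts back toward `mu`."""

    def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2, dt=1.0, seed=None):
        self.size = size
        self.mu = mu
        self.theta = theta  # strength of the pull back toward mu
        self.sigma = sigma  # scale of the random perturbation
        self.dt = dt
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        """Restart the process at its mean, e.g. at the start of an episode."""
        self.state = [self.mu] * self.size

    def sample(self):
        """Advance one step: x <- x + theta*(mu - x)*dt + sigma*sqrt(dt)*N(0, 1)."""
        self.state = [
            x
            + self.theta * (self.mu - x) * self.dt
            + self.sigma * math.sqrt(self.dt) * self.rng.gauss(0.0, 1.0)
            for x in self.state
        ]
        return list(self.state)


noise = OUNoise(size=2, seed=0)
first_sample = noise.sample()
```

Unlike independent Gaussian noise, consecutive samples are correlated, so the perturbation pushes an agent in a similar direction for several steps in a row.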

We look forward to your contributions!

How to Run

Collect Offline Data

python train_online.py agent=maddpg save_replay_buffer=true

While the agents train for 0.5M steps, the trajectory replay buffer is dumped to your experiment/{date}/{time}_maddpg_{exp_name}/buffer folder.
Afterwards, update the path in config/data/local.yaml to point to this experiment output directory.

Download Dataset

Alternatively, if you want to use our pre-collected dataset, you can download it from this link.
We provide 0.5M trajectories in the Asymmetric Advantages layout.
Download the dataset to your local machine and update the path in config/data/local.yaml accordingly.

Train Offline Models

Behavior Cloning

python train_bc.py agent=bc data=local
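As a rough illustration of what behavior cloning optimizes, the sketch below computes the mean negative log-likelihood of the dataset actions under a policy that outputs one logit per discrete action. This is an assumption about the general technique, not a description of train_bc.py; the function names and the six-logit example are hypothetical.

```python
import math


def softmax(logits):
    """Convert logits to probabilities, subtracting the max for stability."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bc_loss(batch_logits, batch_actions):
    """Mean negative log-likelihood of the recorded actions under the policy."""
    losses = []
    for logits, action in zip(batch_logits, batch_actions):
        probs = softmax(logits)
        losses.append(-math.log(probs[action]))
    return sum(losses) / len(losses)


# Hypothetical batch: one uninformed policy output and one confident output.
uniform_loss = bc_loss([[0.0] * 6], [2])
confident_loss = bc_loss([[0.0, 0.0, 5.0, 0.0, 0.0, 0.0]], [2])
```

A policy that assigns more probability to the action recorded in the dataset gets a lower loss, which is what supervised training on the offline buffer drives toward.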

Offline MADDPG (Vanilla)

python train_offline.py agent=maddpg data=local

Offline MADDPG (w/ REM)

python train_offline.py agent=rem_maddpg data=local

Offline MADDPG (w/ BCQ) (WIP)

python train_offline.py agent=bcq_maddpg data=local

Result

Graph

Online	Offline (0.5M Data)	Offline (0.25M Data)

Video

Online	BC	Offline w/ REM

Owner

Baek In-Chang
