Code for "Hierarchical Skills for Efficient Exploration" HSD-3 Algorithm and Baselines

Last update: Dec 06, 2022

Hierarchical Skills for Efficient Exploration

This is the source code release for the paper Hierarchical Skills for Efficient Exploration. It contains:

- Code for pre-training and hierarchical learning with HSD-3
- Code for the baselines we compare to in the paper

Additionally, we provide pre-trained skill policies for the Walker and Humanoid robots considered in the paper.

The benchmark suite can be found in a standalone repository at facebookresearch/bipedal-skills.

Prerequisites

Install PyTorch according to the official instructions, for example in a new conda environment. This codebase was tested with PyTorch 1.8 and 1.9.

Then, install the remaining requirements via:

pip install -r requirements.txt

For optimal performance, we also recommend installing NVIDIA's PyTorch extensions.

Usage

We use Hydra to handle training configurations, with some defaults that may not suit everyone. In particular, we disable Hydra's default job directory management. This is convenient for local development but undesirable when running full experiments. To change it, adapt the initial portion of config/common.yaml or pass an override such as hydra.run.dir=./outputs/my-custom-string to the commands below.
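As a minimal sketch of the command-line option, the snippet below appends a run-directory override to a training command and prints it for copy-pasting. The helper name `with_run_dir` and the `run_name` argument are illustrative assumptions and not part of this repository; only the `-cn` flag and the `hydra.run.dir` override come from the text above.

```python
import shlex


def with_run_dir(base_command, run_name):
    """Return a copy of `base_command` with a Hydra run-directory override appended.

    `run_name` is a hypothetical label chosen by the user; the
    ./outputs/<run_name> layout mirrors the example in the text.
    """
    return list(base_command) + [f"hydra.run.dir=./outputs/{run_name}"]


command = with_run_dir(["python", "train.py", "-cn", "walker_hsd3"], "my-custom-string")

# Print a shell-ready command line instead of executing it.
print(shlex.join(command))
```

Building the command as a list and joining it with `shlex.join` keeps each override a single argument, which matters once values contain spaces or shell metacharacters.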

Pre-training Hierarchical Skills

For pre-training skill policies, use the pretrain.py script (note that this requires a machine with 2 GPUs):

# Walker robot
python pretrain.py -cn walker_pretrain
# Humanoid robot
python pretrain.py -cn humanoid_pretrain

Hierarchical Control

High-level policy training with HSD-3 is done as follows:

# Walker robot
python train.py -cn walker_hsd3
# Humanoid robot
python train.py -cn humanoid_hsd3

The default configuration assumes that a pre-trained skill policy is available at checkpoint-lo.pt. The location can be overridden by setting a new value for agent.lo.init_from (see below for an example). By default, a high-level agent will be trained on the "Hurdles" task. This can be changed by passing env.name=BiskStairs-v1, for example.

Pre-trained skill policies are available here. After unpacking the archive in the top-level directory of this repository, they can be used as follows:

# Walker robot
python train.py -cn walker_hsd3 agent.lo.init_from=$PWD/pretrained-skills/walker.pt
# Humanoid robot
python train.py -cn humanoid_hsd3 agent.lo.init_from=$PWD/pretrained-skills/humanoidpc.pt
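The two overrides discussed in this section (`agent.lo.init_from` and `env.name`) can be combined in one command. The following is a small sketch that also checks that the checkpoint file exists before a run is launched. The function `hsd3_command` is a hypothetical helper, not something shipped with the code, and the early existence check is an added convenience rather than repository behavior.

```python
from pathlib import Path


def hsd3_command(config_name, checkpoint, env_name=None):
    """Build a train.py argument list that loads skills from `checkpoint`.

    Raises FileNotFoundError if the checkpoint is missing, so that a
    mistyped path is reported before any training process is started.
    """
    path = Path(checkpoint).resolve()
    if not path.is_file():
        raise FileNotFoundError(f"skill checkpoint not found: {path}")
    command = ["python", "train.py", "-cn", config_name, f"agent.lo.init_from={path}"]
    if env_name is not None:
        # Optional task override, e.g. env.name=BiskStairs-v1
        command.append(f"env.name={env_name}")
    return command
```

Called from the repository root after unpacking the archive, `hsd3_command("walker_hsd3", "pretrained-skills/walker.pt", env_name="BiskStairs-v1")` would return an argument list that can be passed to `subprocess.run`.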

Baselines

Individual baselines can be run by passing the following as the -cn argument to train.py (for the Walker robot):

| Baseline | Configuration name |
| --- | --- |
| Soft Actor-Critic | `walker_sac` |
| DIAYN-C pre-training | `walker_diaync_pretrain` |
| DIAYN-C HRL | `walker_diaync_hrl` |
| HIRO-SAC | `walker_hiro` |
| Switching Ensemble | `walker_se` |
| HSD-Bandit | `walker_hsdb` |
| SD | `walker_sd` |
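To queue several baselines, the configuration names from the table can be turned into command lines programmatically. This is only a sketch: the dictionary and the function `baseline_commands` are illustrative names, and the snippet prints the commands rather than running them, since any ordering or dependency between the runs is not specified here.

```python
# Configuration names copied from the table above (Walker robot).
WALKER_BASELINES = {
    "Soft Actor-Critic": "walker_sac",
    "DIAYN-C pre-training": "walker_diaync_pretrain",
    "DIAYN-C HRL": "walker_diaync_hrl",
    "HIRO-SAC": "walker_hiro",
    "Switching Ensemble": "walker_se",
    "HSD-Bandit": "walker_hsdb",
    "SD": "walker_sd",
}


def baseline_commands(baselines=WALKER_BASELINES):
    """Map each baseline name to the train.py argument list that selects it."""
    return {
        name: ["python", "train.py", "-cn", config_name]
        for name, config_name in baselines.items()
    }


for name, command in baseline_commands().items():
    print(f"{name}: {' '.join(command)}")
```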

By default, walker_sd will select the full goal space. Other goal spaces can be selected by modifying the configuration, e.g., passing subsets=2-3+4 will limit high-level control to X translation (2) and the left foot (3+4).
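For example, `python train.py -cn walker_sd subsets=2-3+4` would apply the restriction described above. To illustrate how such a string can be read, here is a small sketch that splits it into groups of feature indices. It assumes, based solely on the single example above, that `-` separates goal spaces and `+` joins the features within one goal space. `parse_subsets` is a hypothetical helper; the parser in the code base may accept other forms.

```python
def parse_subsets(spec):
    """Split a subsets string into groups of integer feature indices.

    Assumed format (inferred from the example "2-3+4"): groups are
    separated by "-", and features inside a group are joined by "+".
    """
    groups = []
    for group in spec.split("-"):
        groups.append([int(feature) for feature in group.split("+")])
    return groups


print(parse_subsets("2-3+4"))  # [[2], [3, 4]]
```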

License

hsd3 is MIT licensed, as found in the LICENSE file.

Owner

Facebook Research