Code for the paper "Functional Regularization for Reinforcement Learning via Learned Fourier Features"

Overview

Reinforcement Learning with Learned Fourier Features

State-space Soft Actor-Critic Experiments

Move to the state-SAC-LFF repository.

cd state-SAC-LFF

To install the dependencies, use the provided environment.yml file

conda env create -f environment.yml

To run an experiment, the template for MLP and LFF experiments, respectively, are:

python main.py --policy PytorchSAC --env dm.quadruped.run --start_timesteps 5000 --hidden_dim 1024 --batch_size 1024 --n_hidden 3
python main.py --policy PytorchSAC --env dm.quadruped.run --start_timesteps 5000 --hidden_dim 1024 --batch_size 1024 --n_hidden 2 \
               --network_class FourierMLP --sigma 0.001 --fourier_dim 1024 --train_B --concatenate_fourier

The only thing that changes between the baseline is the number of hidden layers (we reduce by 1 to keep parameter count roughly the same), the network_class, the fourier_dim, sigma, train_B, and concatenate_fourier.

Image-space Soft Actor-Critic Experiments

Move to the image-SAC-LFF repository.

cd image-SAC-LFF

Install RAD dependencies:

conda env create -f conda_env.yml

To run an experiment, the template for CNN and CNN+LFF experiments, respectively, are:

python train.py --domain_name hopper --task_name hop --encoder_type fourier_pixel --action_repeat 4 \
                --num_eval_episodes 10 \--pre_transform_image_size 100 --image_size 84 --agent rad_sac \
                --frame_stack 3 --data_augs crop --critic_lr 1e-3 --actor_lr 1e-3 --eval_freq 10000 --batch_size 128 \
                --num_train_steps 1000000 --fourier_dim 128 --sigma 0.1 --train_B --concatenate_fourier
python train.py --domain_name hopper --task_name hop --encoder_type fair_pixel --action_repeat 4 \
                --num_eval_episodes 10 \--pre_transform_image_size 100 --image_size 84 --agent rad_sac \
                --frame_stack 3 --data_augs crop --critic_lr 1e-3 --actor_lr 1e-3 --eval_freq 10000 --batch_size 128 \
                --num_train_steps 1000000

Proximal Policy Optimization Experiments

Move to the state-PPO-LFF repository.

cd pytorch-a2c-ppo-acktr-gail

Install PPO dependencies:

conda env create -f environment.yml

To run an experiment, the template for MLP and LFF experiments, respectively, are:

python main.py --env-name Hopper-v2 --algo ppo --use-gae --log-interval 1 --num-steps 2048 --num-processes 1 \
               --lr 3e-4 --entropy-coef 0 --value-loss-coef 0.5 --ppo-epoch 10 --num-mini-batch 32 --gamma 0.99 \
               --gae-lambda 0.95 --num-env-steps 1000000 --use-linear-lr-decay --use-proper-time-limits \
               --hidden_dim 256 --network_class MLP --n_hidden 2 --seed 10
python main.py --env-name Hopper-v2 --algo ppo --use-gae --log-interval 1 --num-steps 2048 --num-processes 1 \
               --lr 3e-4 --entropy-coef 0 --value-loss-coef 0.5 --ppo-epoch 10 --num-mini-batch 32 --gamma 0.99 \
               --gae-lambda 0.95 --num-env-steps 1000000 --use-linear-lr-decay --use-proper-time-limits \
               --hidden_dim 256 --network_class FourierMLP --n_hidden 2 --sigma 0.01 --fourier_dim 64 \ 
               --concatenate_fourier --train_B --seed 10

Acknowledgements

We built the state-based SAC codebase off the TD3 repo by Fujimoto et al. We especially appreciated its lightweight bare-bones training loop. For the state-based SAC algorithm implementation and hyperparameters, we used this PyTorch SAC repo by Yarats and Kostrikov. For the SAC+RAD image-based experiments, we used the authors' implementation. Finally, we built off this PPO codebase by Ilya Kostrikov.

Owner
Alex Li
PhD student in machine learning at Carnegie Mellon University. Prev: undergrad at UC Berkeley.
Alex Li
GyroSPD: Vector-valued Distance and Gyrocalculus on the Space of Symmetric Positive Definite Matrices

GyroSPD Code for the paper "Vector-valued Distance and Gyrocalculus on the Space of Symmetric Positive Definite Matrices" accepted at NeurIPS 2021. Re

Federico Lopez 12 Dec 12, 2022
an implementation of Video Frame Interpolation via Adaptive Separable Convolution using PyTorch

This work has now been superseded by: https://github.com/sniklaus/revisiting-sepconv sepconv-slomo This is a reference implementation of Video Frame I

Simon Niklaus 985 Jan 08, 2023
An open-source online reverse dictionary.

An open-source online reverse dictionary.

THUNLP 6.3k Jan 09, 2023
Official repository for "Deep Recurrent Neural Network with Multi-scale Bi-directional Propagation for Video Deblurring".

RNN-MBP Deep Recurrent Neural Network with Multi-scale Bi-directional Propagation for Video Deblurring (AAAI-2022) by Chao Zhu, Hang Dong, Jinshan Pan

SIV-LAB 22 Aug 31, 2022
ML powered analytics engine for outlier detection and root cause analysis.

Website • Docs • Blog • LinkedIn • Community Slack ML powered analytics engine for outlier detection and root cause analysis ✨ What is Chaos Genius? C

Chaos Genius 523 Jan 04, 2023
Jiminy Cricket Environment (NeurIPS 2021)

Jiminy Cricket This is the repository for "What Would Jiminy Cricket Do? Towards Agents That Behave Morally" by Dan Hendrycks*, Mantas Mazeika*, Andy

Dan Hendrycks 15 Aug 29, 2022
Occlusion robust 3D face reconstruction model in CFR-GAN (WACV 2022)

Occlusion Robust 3D face Reconstruction Yeong-Joon Ju, Gun-Hee Lee, Jung-Ho Hong, and Seong-Whan Lee Code for Occlusion Robust 3D Face Reconstruction

Yeongjoon 31 Dec 19, 2022
Bayesian Inference Tools in Python

BayesPy Bayesian Inference Tools in Python Our goal is, given the discrete outcomes of events, estimate the distribution of categories. Using gradient

Max Sklar 99 Dec 14, 2022
🛰️ List of earth observation companies and job sites

Earth Observation Companies & Jobs source Portals & Jobs Geospatial Geospatial jobs newsletter: ~biweekly newsletter with geospatial jobs by Ali Ahmad

Dahn 64 Dec 27, 2022
An implementation demo of the ICLR 2021 paper Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks in PyTorch.

Neural Attention Distillation This is an implementation demo of the ICLR 2021 paper Neural Attention Distillation: Erasing Backdoor Triggers from Deep

Yige-Li 84 Jan 04, 2023
Pytorch implementation of paper "Efficient Nearest Neighbor Language Models" (EMNLP 2021)

Pytorch implementation of paper "Efficient Nearest Neighbor Language Models" (EMNLP 2021)

Junxian He 57 Jan 01, 2023
Resilient projection-based consensus actor-critic (RPBCAC) algorithm

Resilient projection-based consensus actor-critic (RPBCAC) algorithm We implement the RPBCAC algorithm with nonlinear approximation from [1] and focus

Martin Figura 5 Jul 12, 2022
Code used for the results in the paper "ClassMix: Segmentation-Based Data Augmentation for Semi-Supervised Learning"

Code used for the results in the paper "ClassMix: Segmentation-Based Data Augmentation for Semi-Supervised Learning" Getting started Prerequisites CUD

70 Dec 02, 2022
A general-purpose encoder-decoder framework for Tensorflow

READ THE DOCUMENTATION CONTRIBUTING A general-purpose encoder-decoder framework for Tensorflow that can be used for Machine Translation, Text Summariz

Google 5.5k Jan 07, 2023
Learning 3D Part Assembly from a Single Image

Learning 3D Part Assembly from a Single Image This repository contains a PyTorch implementation of the paper: Learning 3D Part Assembly from A Single

18 Dec 21, 2022
Semi-SDP Semi-supervised parser for semantic dependency parsing.

Semi-SDP Semi-supervised parser for semantic dependency parsing. This repo contains the code used for the semi-supervised semantic dependency parser i

12 Sep 17, 2021
The source code for CATSETMAT: Cross Attention for Set Matching in Bipartite Hypergraphs

catsetmat The source code for CATSETMAT: Cross Attention for Set Matching in Bipartite Hypergraphs To be able to run it, add catsetmat to PYTHONPATH H

2 Dec 19, 2022
Official implementation of "Motif-based Graph Self-Supervised Learning forMolecular Property Prediction"

Motif-based Graph Self-Supervised Learning for Molecular Property Prediction Official Pytorch implementation of NeurIPS'21 paper "Motif-based Graph Se

zaixi 71 Dec 20, 2022
Official Pytorch implementation of the paper: "Locally Shifted Attention With Early Global Integration"

Locally-Shifted-Attention-With-Early-Global-Integration Pretrained models You can download all the models from here. Training Imagenet python -m torch

Shelly Sheynin 14 Apr 15, 2022
[CVPR'2020] DeepDeform: Learning Non-rigid RGB-D Reconstruction with Semi-supervised Data

DeepDeform (CVPR'2020) DeepDeform is an RGB-D video dataset containing over 390,000 RGB-D frames in 400 videos, with 5,533 optical and scene flow imag

Aljaz Bozic 165 Jan 09, 2023