Code for the paper "Functional Regularization for Reinforcement Learning via Learned Fourier Features"

Last update: Nov 11, 2022

Related tags

Overview

Reinforcement Learning with Learned Fourier Features

State-space Soft Actor-Critic Experiments

Move to the state-SAC-LFF repository.

cd state-SAC-LFF

To install the dependencies, use the provided environment.yml file

conda env create -f environment.yml

To run an experiment, the template for MLP and LFF experiments, respectively, are:

python main.py --policy PytorchSAC --env dm.quadruped.run --start_timesteps 5000 --hidden_dim 1024 --batch_size 1024 --n_hidden 3
python main.py --policy PytorchSAC --env dm.quadruped.run --start_timesteps 5000 --hidden_dim 1024 --batch_size 1024 --n_hidden 2 \
               --network_class FourierMLP --sigma 0.001 --fourier_dim 1024 --train_B --concatenate_fourier

The only thing that changes between the baseline is the number of hidden layers (we reduce by 1 to keep parameter count roughly the same), the network_class, the fourier_dim, sigma, train_B, and concatenate_fourier.

Image-space Soft Actor-Critic Experiments

Move to the image-SAC-LFF repository.

cd image-SAC-LFF

Install RAD dependencies:

conda env create -f conda_env.yml

To run an experiment, the template for CNN and CNN+LFF experiments, respectively, are:

python train.py --domain_name hopper --task_name hop --encoder_type fourier_pixel --action_repeat 4 \
                --num_eval_episodes 10 \--pre_transform_image_size 100 --image_size 84 --agent rad_sac \
                --frame_stack 3 --data_augs crop --critic_lr 1e-3 --actor_lr 1e-3 --eval_freq 10000 --batch_size 128 \
                --num_train_steps 1000000 --fourier_dim 128 --sigma 0.1 --train_B --concatenate_fourier
python train.py --domain_name hopper --task_name hop --encoder_type fair_pixel --action_repeat 4 \
                --num_eval_episodes 10 \--pre_transform_image_size 100 --image_size 84 --agent rad_sac \
                --frame_stack 3 --data_augs crop --critic_lr 1e-3 --actor_lr 1e-3 --eval_freq 10000 --batch_size 128 \
                --num_train_steps 1000000

Proximal Policy Optimization Experiments

Move to the state-PPO-LFF repository.

cd pytorch-a2c-ppo-acktr-gail

Install PPO dependencies:

conda env create -f environment.yml

To run an experiment, the template for MLP and LFF experiments, respectively, are:

python main.py --env-name Hopper-v2 --algo ppo --use-gae --log-interval 1 --num-steps 2048 --num-processes 1 \
               --lr 3e-4 --entropy-coef 0 --value-loss-coef 0.5 --ppo-epoch 10 --num-mini-batch 32 --gamma 0.99 \
               --gae-lambda 0.95 --num-env-steps 1000000 --use-linear-lr-decay --use-proper-time-limits \
               --hidden_dim 256 --network_class MLP --n_hidden 2 --seed 10
python main.py --env-name Hopper-v2 --algo ppo --use-gae --log-interval 1 --num-steps 2048 --num-processes 1 \
               --lr 3e-4 --entropy-coef 0 --value-loss-coef 0.5 --ppo-epoch 10 --num-mini-batch 32 --gamma 0.99 \
               --gae-lambda 0.95 --num-env-steps 1000000 --use-linear-lr-decay --use-proper-time-limits \
               --hidden_dim 256 --network_class FourierMLP --n_hidden 2 --sigma 0.01 --fourier_dim 64 \ 
               --concatenate_fourier --train_B --seed 10

Acknowledgements

We built the state-based SAC codebase off the TD3 repo by Fujimoto et al. We especially appreciated its lightweight bare-bones training loop. For the state-based SAC algorithm implementation and hyperparameters, we used this PyTorch SAC repo by Yarats and Kostrikov. For the SAC+RAD image-based experiments, we used the authors' implementation. Finally, we built off this PPO codebase by Ilya Kostrikov.

Code for the paper "Functional Regularization for Reinforcement Learning via Learned Fourier Features"

Related tags

Overview

Reinforcement Learning with Learned Fourier Features

State-space Soft Actor-Critic Experiments

Image-space Soft Actor-Critic Experiments

Proximal Policy Optimization Experiments

Acknowledgements

Owner

Alex Li

GyroSPD: Vector-valued Distance and Gyrocalculus on the Space of Symmetric Positive Definite Matrices

an implementation of Video Frame Interpolation via Adaptive Separable Convolution using PyTorch

An open-source online reverse dictionary.

Official repository for "Deep Recurrent Neural Network with Multi-scale Bi-directional Propagation for Video Deblurring".

ML powered analytics engine for outlier detection and root cause analysis.

Jiminy Cricket Environment (NeurIPS 2021)

Occlusion robust 3D face reconstruction model in CFR-GAN (WACV 2022)

Bayesian Inference Tools in Python

🛰️ List of earth observation companies and job sites

An implementation demo of the ICLR 2021 paper Neural Attention Distillation: Erasing Backdoor Triggers from Deep Neural Networks in PyTorch.

Pytorch implementation of paper "Efficient Nearest Neighbor Language Models" (EMNLP 2021)

Resilient projection-based consensus actor-critic (RPBCAC) algorithm

Code used for the results in the paper "ClassMix: Segmentation-Based Data Augmentation for Semi-Supervised Learning"

A general-purpose encoder-decoder framework for Tensorflow

Learning 3D Part Assembly from a Single Image

Semi-SDP Semi-supervised parser for semantic dependency parsing.

The source code for CATSETMAT: Cross Attention for Set Matching in Bipartite Hypergraphs

Official implementation of "Motif-based Graph Self-Supervised Learning forMolecular Property Prediction"

Official Pytorch implementation of the paper: "Locally Shifted Attention With Early Global Integration"

[CVPR'2020] DeepDeform: Learning Non-rigid RGB-D Reconstruction with Semi-supervised Data