Official codebase for Can Wikipedia Help Offline Reinforcement Learning?

Last update: Dec 19, 2022

Overview

Can Wikipedia Help Offline RL?

Machel Reid, Yutaro Yamada and Shixiang Shane Gu.

Our paper is up on arXiv.

This is the official codebase for "Can Wikipedia Help Offline Reinforcement Learning?". It contains scripts to reproduce the experiments and is based on the Decision Transformer codebase (https://github.com/kzl/decision-transformer).

Instructions

The code for our experiments is provided in the code directory.

Installation

Experiments require MuJoCo. Follow the instructions in the mujoco-py repo to install. Then, dependencies can be installed with the following command:

conda env create -f conda_env.yml
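
After creating the environment, a quick check that the key packages can be found may save a failed run later. The sketch below is not part of the repository. The module names (mujoco_py, torch, transformers) are assumptions about what conda_env.yml installs, so adjust them to match the file.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed module names; edit to match the packages listed in conda_env.yml.
    required = ["mujoco_py", "torch", "transformers"]
    missing = missing_modules(required)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```

The check only locates the modules and does not import them, so it will not trigger any build step on first import.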

Downloading datasets

Datasets are stored in the data directory. The LM co-training and vision experiments can be found in the lm_cotraining and vision directories, respectively. Install the D4RL repo, following the instructions there. Then run the following script to download the datasets and save them in our format:

python download_d4rl_datasets.py
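
Once the download script has finished, it can be useful to inspect what was saved. The sketch below assumes the format inherited from the Decision Transformer codebase: a pickle file holding a list of trajectories, each a dict with a "rewards" sequence. That layout is an assumption, as are the helper names load_trajectories and summarize_returns, so check download_d4rl_datasets.py before relying on it.

```python
import pickle
import statistics


def load_trajectories(path):
    """Load a pickled list of trajectory dicts (assumed on-disk format)."""
    with open(path, "rb") as f:
        return pickle.load(f)


def summarize_returns(trajectories):
    """Compute simple statistics over undiscounted trajectory returns."""
    # float() keeps this working whether rewards are lists or array-like.
    returns = [sum(float(r) for r in traj["rewards"]) for traj in trajectories]
    return {
        "num_trajectories": len(returns),
        "num_timesteps": sum(len(traj["rewards"]) for traj in trajectories),
        "mean_return": statistics.fmean(returns),
        "max_return": max(returns),
        "min_return": min(returns),
    }
```

To use it, pass the path of one saved file to load_trajectories and hand the result to summarize_returns. The file name depends on what the download script writes.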

Downloading ChibiT

ChibiT can be downloaded with gdown as follows:

gdown --id $ID #we will add it soon!

Example usage

Experiments can be reproduced with the following:

# --pretrained_lm takes gpt2 or a path to ChibiT
python experiment.py --env hopper --dataset medium --model_type dt --pretrained_lm gpt2 \
    --gpt_kmeans --gpt_kmeans-const 0.1
--

The run.sh file has example commands.

Adding -w True will log results to Weights and Biases.
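
To run the same configuration over several environments, the command line can be assembled programmatically. This is a minimal sketch, not part of the repository: build_command is a hypothetical helper that uses only the flags shown above, and the environment names other than hopper are assumptions based on the usual D4RL naming.

```python
import shlex


def build_command(env, dataset, pretrained_lm="gpt2", log_to_wandb=False, extra_args=()):
    """Assemble an experiment.py invocation from the flags shown in this README."""
    cmd = [
        "python", "experiment.py",
        "--env", env,
        "--dataset", dataset,
        "--model_type", "dt",
        "--pretrained_lm", pretrained_lm,
    ]
    # Any further flags (e.g. the k-means options) are passed through unchanged.
    cmd.extend(extra_args)
    if log_to_wandb:
        cmd.extend(["-w", "True"])
    return cmd


# Assumed environment names; only hopper/medium appears in the example above.
for env in ["hopper", "halfcheetah", "walker2d"]:
    print(shlex.join(build_command(env, "medium", log_to_wandb=True)))
```

The loop only prints the commands. To launch them, each list can be passed to subprocess.run(cmd, check=True) from the directory that contains experiment.py.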

Citation

Please cite our paper as:

@misc{reid2022wikipedia,
      title={Can Wikipedia Help Offline Reinforcement Learning?}, 
      author={Machel Reid and Yutaro Yamada and Shixiang Shane Gu},
      year={2022},
      eprint={2201.12122},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}

License

MIT
