API for the GPT-J language model 🦜. Including a FastAPI backend and a streamlit frontend

Last update: Dec 31, 2022

Overview

gpt-j-api 🦜

An API to interact with the GPT-J language model. You can use and test the model in two different ways:

Streamlit web app at http://api.vicgalle.net:8000/
The proper API, documented at http://api.vicgalle.net:5000/docs

Using the API

Python:

import requests
context = "In a shocking finding, scientist discovered a herd of unicorns living in a remote, previously unexplored valley, in the Andes Mountains. Even more surprising to the researchers was the fact that the unicorns spoke perfect English."
payload = {
    "context": context,
    "token_max_length": 512,
    "temperature": 1.0,
    "top_p": 0.9,
}
response = requests.post("http://api.vicgalle.net:5000/generate", params=payload).json()
print(response)

Bash:

curl -X 'POST' \
  'http://api.vicgalle.net:5000/generate?context=In%20a%20shocking%20finding%2C%20scientists%20discovered%20a%20herd%20of%20unicorns%20living%20in%20a%20remote%2C%20previously%20unexplored%20valley%2C%20in%20the%20Andes%20Mountains.%20Even%20more%20surprising%20to%20the%20researchers%20was%20the%20fact%20that%20the%20unicorns%20spoke%20perfect%20English.&token_max_length=512&temperature=1&top_p=0.9' \
  -H 'accept: application/json' \
  -d ''

Deployment of the API server

Just ssh into a TPU VM. This code was only tested on the v3-8 variants.

First, install the requirements and get the weigts:

python3 -m pip install -r requirements.txt
wget https://the-eye.eu/public/AI/GPT-J-6B/step_383500_slim.tar.zstd
sudo apt install zstd
tar -I zstd -xf step_383500_slim.tar.zstd

And just run

python3 serve.py

Then, you can go to http://localhost:5000/docs and use the API!

Deploy the streamlit dashboard

Just run

python3 -m streamlit run streamlit_app.py --server.port 8000

Acknowledgements

Thanks to the support of the TPU Research Cloud, https://sites.research.google/trc/

Comments

I've made an extensions using this api

https://chrome.google.com/webstore/detail/type-j/femdhcgkiiagklmickakfoogeehbjnbh

You can check it out here

First i was very hyped up and it felt fun, like I was talking to a machine, but then I lost my enthusiasm and now I feel like it's totally useless xD

I'm just leaving a link here for you to appreciate you, it became real thanks for you posting this api

feel free to delete the issue as it's out of scope

if you got ideas on how to make it commercially succesful - i'll be happy to partner up

peace

opened by oogxdd 5

Illegal Instruction

When installing like described in the readme (fresh conda env,python=3.8, ubuntu) I'll get a illegal instruction immediately after running python serve.py

(gpt-j-api) […]@[…]:/opt/GPT/gpt-j-api$ python -q -X faulthandler serve.py
Fatal Python error: Illegal instruction

Current thread 0x00007f358d7861c0 (most recent call first):
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 1166 in create_module
  File "<frozen importlib._bootstrap>", line 556 in module_from_spec
  File "<frozen importlib._bootstrap>", line 657 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 991 in _find_and_load
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1042 in _handle_fromlist
  File "/home/korny/miniconda3/envs/gpt-j-api/lib/python3.8/site-packages/jaxlib/xla_client.py", lin
e 31 in <module>
  File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 991 in _find_and_load
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1042 in _handle_fromlist
  File "/home/korny/miniconda3/envs/gpt-j-api/lib/python3.8/site-packages/jax/lib/__init__.py", line 58 in <module>
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 843 in exec_module
  File "<frozen importlib._bootstrap>", line 671 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 991 in _find_and_load
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 1042 in _handle_fromlist
  File "/home/korny/miniconda3/envs/gpt-j-api/lib/python3.8/site-packages/jax/config.py", line 26 in <module>
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 843 in exec_module
  File "<frozen importlib._bootstrap>", line 671 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 991 in _find_and_load
  File "/home/korny/miniconda3/envs/gpt-j-api/lib/python3.8/site-packages/jax/__init__.py", line 33 in <module>
  File "<frozen importlib._bootstrap>", line 219 in _call_with_frames_removed
  File "<frozen importlib._bootstrap_external>", line 843 in exec_module
  File "<frozen importlib._bootstrap>", line 671 in _load_unlocked
  File "<frozen importlib._bootstrap>", line 975 in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 991 in _find_and_load
  File "serve.py", line 3 in <module>
Illegal instruction (core dumped)

EDIT

running this on CPU only, I tried installing jax[CPU] - same resut

opened by chris-aeviator 4

api seems offline

When I try to access the API I get the following error: ERR_CONNECTION_TIMED_OUT. But when I try to connect to it using a different IP address it does work. Am I IP banned?

opened by KoenTech 4
Usage

I'm using this to host a Discord chatbot, and though I have slowmode on the channel there's still a lot of usage, and often the API is being used as fast as it can generate completions. Will this harm the experience for others? Should I limit it more? (thanks for making this free but I don't want to take advantage of that too much if it's bad for others)

opened by Heath123 3
Alternative to Google TPU VM?

Hello,

I would like to run a local instance of GPT-J, but avoid using Google.

I have little to no experience in machine learning and its requirements, are there other solutions I could use? (What are the requirements for a machine in order to run GPT-J?)

Thank you very much!

opened by birkenbaum 3
Is there a way to speed up inference?

Hello, I am currently working on a project where I need quick inference. It needn't be real-time, but something around 7-10 sec would be great. Is there a way to speed up the inference using the API?

The model does not seem to be a problem as compute_time is around 8sec, but by the time the request arrives it takes around 20 seconds (over 30 on some occasions). Is there a way to make the request a bit faster?

Thanks,

opened by Aryagm 2
Errno 111

Could anyone please fix the following error? Thanks a lot.

"ConnectionError: ...Failed to establish a new connection: [Errno 111] Connection refused"

opened by Mather10 1
How to make the api public?

Hey, I was able to get serve.py running with the instructions you gave. But now I want to make the api public and connect it to a domain name so it can be publicly accessed (without needing a connection to the vm). How can I achieve this?

I want to do the same thing you did with "http://api.vicgalle.net:5000/generate" and "http://api.vicgalle.net:5000/docs".

Thanks,

opened by Aryagm 1
API VM?

Hi I wanted to host my own version of the api, where is the public one hosted? is it on a google cloud TPU VM? The ones ive seen here https://cloud.google.com/tpu/pricing are very expensive :D Is a TPU VM needed and the model won't be able to run on a normal GPU VM?

Thanks!

opened by jryebread 1
Raw text...

This is probably a very stupid question but whenever I run GPT-J I always get the full output:

{'model': 'GPT-J-6B', 'compute_time': 1.2492187023162842, 'text': ' \n(and you\'ll be a slave)\n\n**_"I\'m not a robot, I\'m a human being."_**\n\n**_"I\'m not a robot, I\'m a human being."_**\n\n', 'prompt': 'AI will take over the world ', 'token_max_length': 50, 'temperature': 0.09, 'top_p': 0.9, 'stop_sequence': None}

What parameter do I need to change so it only outputs the generated text?

(and you'll be a slave) I'm not a robot, I'm a human being. I'm not a robot, I'm a human being.

opened by Vilagamer999 1
Latency with TPU VM

Got things running on Google Clouds, really happy :). Was hoping for a little but of a speed increase, but computation time is the same and latency on the request seems to be the main delay. Did you experiment with firewalls and ports to improve things?

opened by Ontopic 1
Version support for Huggingface GPT-J 6B

GPT-J Huggingface and streamlit style like by project-code py

from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("EleutherAI/gpt-j-6B")

model = AutoModelForCausalLM.from_pretrained("EleutherAI/gpt-j-6B")

opened by ghost 0

Releases(v0.3)

v0.3(Aug 8, 2021)
New feature:

New API endpoint: classify. Allows to use the model as zero-shot classification of texts. See the documentation

Source code(tar.gz)
Source code(zip)
v0.2.1(Jul 27, 2021)
New features:

Introduce a new argument stop_sequence in the API

Introduce tests, to ensure the API is live

Source code(tar.gz)
Source code(zip)
v0.2(Jul 8, 2021)

Now includes a streamlit web app that connects to the API
Source code(tar.gz)
Source code(zip)
v0.1(Jun 29, 2021)

Source code(tar.gz)
Source code(zip)

Owner

Víctor Gallego

Data scientist & predoc researcher

GitHub Repository http://api.vicgalle.net:8000/

This converter will create the exact measure for your cappuccino recipe from the grandiose Rafaella Ballerini!

About CappuccinoJs This converter will create the exact measure for your cappuccino recipe from the grandiose Rafaella Ballerini! Este conversor criar

48 Nov 15, 2022

Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System

Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System Authors: Yixuan Su, Lei Shu, Elman Mansimov, Arshit Gupta, Deng Cai, Yi-An Lai

124 Jan 03, 2023

Large-scale Knowledge Graph Construction with Prompting

Large-scale Knowledge Graph Construction with Prompting across tasks (predictive and generative), and modalities (language, image, vision + language, etc.)

161 Dec 28, 2022

🦅 Pretrained BigBird Model for Korean (up to 4096 tokens)

Pretrained BigBird Model for Korean What is BigBird • How to Use • Pretraining • Evaluation Result • Docs • Citation 한국어 | English What is BigBird? Bi

183 Dec 14, 2022

RoNER is a Named Entity Recognition model based on a pre-trained BERT transformer model trained on RONECv2

RoNER RoNER is a Named Entity Recognition model based on a pre-trained BERT transformer model trained on RONECv2. It is meant to be an easy to use, hi

9 Nov 07, 2022

Lyrics generation with GPT2-based Transformer

HuggingArtists - Train a model to generate lyrics Create AI-Artist in just 5 minutes! 🚀 Run the demo notebook to train 🚀 Run the GUI demo to test Di

65 Dec 19, 2022

File-based TF-IDF: Calculates keywords in a document, using a word corpus.

File-based TF-IDF Calculates keywords in a document, using a word corpus. Why? Because I found myself with hundreds of plain text files, with no way t

1 Feb 11, 2022

Transformer Based Korean Sentence Spacing Corrector

TKOrrector Transformer Based Korean Sentence Spacing Corrector License Summary This solution is made available under Apache 2 license. See the LICENSE

3 Apr 18, 2022

This project consists of data analysis and data visualization (done using python)of all IPL seasons from 2008 to 2019 and answering the most asked questions about the IPL.

IPL-data-analysis This project consists of data analysis and data visualization of all IPL seasons from 2008 to 2019 and answering the most asked ques

2 Feb 08, 2022

Training code of Spatial Time Memory Network. Semi-supervised video object segmentation.

Training-code-of-STM This repository fully reproduces Space-Time Memory Networks Performance on Davis17 val set&Weights backbone training stage traini

128 Dec 11, 2022

An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition

CRNN paper：An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition 1. create your ow

3 Apr 02, 2022

Global Rhythm Style Transfer Without Text Transcriptions

Global Prosody Style Transfer Without Text Transcriptions This repository provides a PyTorch implementation of AutoPST, which enables unsupervised glo

193 Dec 30, 2022

Simple Python library, distributed via binary wheels with few direct dependencies, for easily using wav2vec 2.0 models for speech recognition

Wav2Vec2 STT Python Beta Software Simple Python library, distributed via binary wheels with few direct dependencies, for easily using wav2vec 2.0 mode

22 Dec 29, 2022

API for the GPT-J language model 🦜. Including a FastAPI backend and a streamlit frontend

Related tags

Overview

gpt-j-api 🦜

Using the API

Deployment of the API server

Deploy the streamlit dashboard

Acknowledgements

Comments

EDIT

Releases(v0.3)

v0.3(Aug 8, 2021)

New feature:

v0.2.1(Jul 27, 2021)

v0.2(Jul 8, 2021)

v0.1(Jun 29, 2021)

Owner

Víctor Gallego

This converter will create the exact measure for your cappuccino recipe from the grandiose Rafaella Ballerini!

Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System

Large-scale Knowledge Graph Construction with Prompting

🦅 Pretrained BigBird Model for Korean (up to 4096 tokens)

RoNER is a Named Entity Recognition model based on a pre-trained BERT transformer model trained on RONECv2

Lyrics generation with GPT2-based Transformer

File-based TF-IDF: Calculates keywords in a document, using a word corpus.

Transformer Based Korean Sentence Spacing Corrector

This project consists of data analysis and data visualization (done using python)of all IPL seasons from 2008 to 2019 and answering the most asked questions about the IPL.

Training code of Spatial Time Memory Network. Semi-supervised video object segmentation.

An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition

Global Rhythm Style Transfer Without Text Transcriptions

Simple Python library, distributed via binary wheels with few direct dependencies, for easily using wav2vec 2.0 models for speech recognition

Code for the paper "VisualBERT: A Simple and Performant Baseline for Vision and Language"

Extract Keywords from sentence or Replace keywords in sentences.

Transformers Wav2Vec2 + Parlance's CTCDecodeTransformers Wav2Vec2 + Parlance's CTCDecode

FedNLP: A Benchmarking Framework for Federated Learning in Natural Language Processing

Text editor on python tkinter to convert english text to other languages with the help of ployglot.

Code for evaluating Japanese pretrained models provided by NTT Ltd.

PyABSA - Open & Efficient for Framework for Aspect-based Sentiment Analysis