Finetune the base 64 px GLIDE-text2im model from OpenAI on your own image-text dataset

Last update: Oct 13, 2022

Overview

glide-finetune

Finetune the base 64 px GLIDE-text2im model from OpenAI on your own image-text dataset.

Installation

git clone https://github.com/afiaka87/glide-finetune.git
cd glide-finetune/
python3 -m venv .venv # create a virtual environment to keep global install clean.
source .venv/bin/activate
(.venv) # optionally install pytorch manually for your own specific env first...
(.venv) python -m pip install -r requirements.txt

Usage

(.venv) python glide-finetune.py \
    --data_dir=./data \
    --batch_size=1 \
    --grad_acc=1 \
    --guidance_scale=4.0 \
    --learning_rate=2e-5 \
    --dropout=0.1 \
    --timestep_respacing=1000 \
    --side_x=64 \
    --side_y=64 \
    --resume_ckpt='' \
    --checkpoints_dir='./glide_checkpoints/' \
    --use_fp16 \
    --device='' \
    --freeze_transformer \
    --freeze_diffusion \
    --weight_decay=0.0 \
    --project_name='glide-finetune'

Known issues:

Batching isn't handled in the dataloader.
NaN/Inf errors.
Resizing doesn't handle non-square aspect ratios properly.
Some of the code is messy and needs refactoring.
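One possible workaround for the non-square resizing issue is to crop the largest centered square before resizing to the 64 px training resolution. The sketch below only computes the crop box; `center_crop_box` is a hypothetical helper and is not part of this repository.

```python
from typing import Tuple


def center_crop_box(width: int, height: int) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) of the largest centered square.

    Hypothetical helper for illustration; not taken from glide-finetune.
    """
    side = min(width, height)
    left = (width - side) // 2
    top = (height - side) // 2
    return left, top, left + side, top + side


# A 640x480 landscape image is cropped to its central 480x480 region.
print(center_crop_box(640, 480))  # (80, 0, 560, 480)
```

With Pillow, the returned box could be passed to `Image.crop(box)` and the result resized to `(side_x, side_y)`, so the image is no longer stretched.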

Comments

Fixed a couple of minor issues
Pinned the webdataset version to work with Python 3.7, which is the version used in Colab and Kaggle. A new version of this module was released a few days ago and only works with Python 3.8/3.9.

Fixed an issue with data_dir arg not getting picked up.
opened by vanga 1
Fix NameError when using --data_dir
Hello and thank you for your great work.

Right now using a local data folder with --data_dir results in

Traceback (most recent call last):
  File "/content/glide-finetune/train_glide.py", line 292, in <module>
    data_dir=data_dir,
NameError: name 'data_dir' is not defined

This PR fixes that.
opened by tillfalko 0

mention mpi4py dependency

Installing mpi4py will fail unless the user already has MPI installed. Since MPI is not a ubiquitous dependency, it should probably be mentioned. Edit: Since torch==1.10.1 is a requirement, and each torch version ships with its own CUDA version (torch 1.10.1 uses CUDA 10.2), I don't see a reason not to just include bitsandbytes-cuda102 in requirements.txt.

$ py -m venv .venv
$ source .venv/bin/activate
$ pip install torch==1.10.1
Collecting torch==1.10.1
  Downloading torch-1.10.1-cp39-cp39-manylinux1_x86_64.whl (881.9 MB)
     |████████████████████████████████| 881.9 MB 15 kB/s
Collecting typing-extensions
  Downloading typing_extensions-4.0.1-py3-none-any.whl (22 kB)
Installing collected packages: typing-extensions, torch
Successfully installed torch-1.10.1 typing-extensions-4.0.1
$ py -c "import torch; print(torch.__version__)"
1.10.1+cu102
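For the mpi4py point, a quick pre-install check can tell the user whether an MPI toolchain is present. This is a minimal sketch that assumes the `mpicc` compiler wrapper is the thing to look for on PATH; `mpi_compiler_available` is a hypothetical helper and is not part of this repository.

```python
import shutil


def mpi_compiler_available(wrapper: str = "mpicc") -> bool:
    """Return True if the given MPI compiler wrapper is found on PATH.

    Assumption: building mpi4py from source needs an MPI compiler wrapper
    such as mpicc, so its absence is a good early warning.
    """
    return shutil.which(wrapper) is not None


if not mpi_compiler_available():
    print(
        "No mpicc found on PATH. Install an MPI implementation "
        "(for example Open MPI or MPICH) before installing mpi4py."
    )
```

Running this before `pip install -r requirements.txt` gives a clearer message than a failed build.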

opened by tillfalko 0

Fixed half precision optimizer bug

Problem

In half precision, NaN values start appearing after the first iteration, regardless of input data or gradients, because the Adam optimizer breaks in float16. The discussion for that can be viewed here.

Solution

This can be fixed by setting the optimizer's eps parameter to 1e-4 instead of the default 1e-8. That is the only change in this PR.
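One likely reason the larger eps helps is that 1e-8 is smaller than anything float16 can represent, so it rounds to zero and no longer protects the division in the Adam update. The sketch below emulates half-precision rounding with the standard library only; `to_float16` and `adam_denominator_fp16` are hypothetical helpers for illustration and are not PyTorch's actual implementation.

```python
import math
import struct


def to_float16(value: float) -> float:
    """Round a Python float to IEEE 754 half precision and back."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


def adam_denominator_fp16(second_moment: float, eps: float) -> float:
    """Emulate sqrt(v) + eps with every intermediate rounded to float16.

    Illustrative only: a simplified stand-in for the denominator of the
    Adam update, not the optimizer's real code path.
    """
    root = to_float16(math.sqrt(second_moment))
    return to_float16(root + to_float16(eps))


# The default eps vanishes in half precision; 1e-4 survives.
print(to_float16(1e-8))   # 0.0
print(to_float16(1e-4) > 0.0)  # True

# With a second-moment estimate that has underflowed to zero,
# the default eps leaves a zero denominator.
print(adam_denominator_fp16(0.0, 1e-8))  # 0.0
print(adam_denominator_fp16(0.0, 1e-4) > 0.0)  # True
```

In PyTorch, the equivalent change is passing `eps=1e-4` to the optimizer constructor, for example `torch.optim.Adam(params, eps=1e-4)`.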

opened by isamu-isozaki 0
Training on half precision leads to nan values

I was training my model and noticed that I was running into NaN values after just the first iteration. As it turns out, my gradients and input values/images were all normal, but PyTorch's Adam optimizer has some odd behavior in float16 precision where it produces NaNs, probably because of a divide-by-zero error. A discussion can be found below.

https://discuss.pytorch.org/t/adam-half-precision-nans/1765/4

I hear that changing the epsilon parameter of the Adam optimizer works in half precision, but I haven't tested it yet. I'll make a fix once I have tested it.

And also let me say thanks for this repo. I wanted to fine tune the glide model and this made it so much easier.

opened by isamu-isozaki 1
Where is the resume_ckpt

Hi, thanks for your work.

I noticed that to finetune GLIDE we need a base model, namely "resume_ckpt": --resume_ckpt 'ckpt_to_resume_from.pt'
Where can we get this model? As far as I can find, GLIDE didn't provide any checkpoint either. Thanks for your help.

opened by zhaobingbingbing 0

Releases(v0.0.1)

v0.0.1(Feb 20, 2022)
Having some experience with finetuning GLIDE on laion, alamy, etc., I think this code works great now, and I hope as many people as possible can use it. Please file bugs - I know there may be a few.

New additions:

dataloader for LAION400M

dataloader for alamy

train the upsample model instead of just the base model

(early) code for training the released noisy CLIP. still a WIP.


Owner

Clay Mullis

Software engineer working with multi-modal deep learning.

