Code for paper Multitask-Finetuning of Zero-shot Vision-Language Models

Last update: Jul 15, 2022

Overview

Downloading our datasets

https://drive.google.com/file/d/1CfomsX6qmdCLfFutptqrQnp1RlaJEpXh/view?usp=sharing
Extract the archive and place the data/ folder in the same root directory as src/.

Dataset structure

Each dataset may contain several subdatasets, although most have only one.

|
    |dataset/
        -|
            -|
            -|
        -|
        ...
    |pickled/
        -|tensor_dict.pt

The pickle file tensor_dict.pt has the following format:

{
    'subdataset_1':{
        'label_1':{
            'image_tensors':np.array((N,3,224,224)), # N: image number
            'input_ids':np.array(S), # S: token length of the filled template text
            'attention_masks':np.array(S),
            'template_input_ids':np.array(S_), # S_: token length of the un-filled template text
            'template_attention_masks':np.array(S_),
        },
        'label_2':{
            ...
        }
    },
    ...
}
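
As a rough illustration of this layout, the sketch below walks a nested dictionary of the shape described above and reports the array shapes stored under each label. The helper names `summarize_tensor_dict` and `count_images` are hypothetical and not part of this repository. The code only assumes that each stored array exposes a `.shape` attribute, as NumPy arrays do; loading the file itself is left out.

```python
def summarize_tensor_dict(tensor_dict):
    """Return {(subdataset, label): {field: shape}} for a nested dict
    laid out as subdataset -> label -> field -> array."""
    summary = {}
    for subdataset, labels in tensor_dict.items():
        for label, fields in labels.items():
            # Any object with a .shape attribute works (e.g. a NumPy array).
            summary[(subdataset, label)] = {
                name: tuple(array.shape) for name, array in fields.items()
            }
    return summary


def count_images(tensor_dict, subdataset):
    """Number of images per label: N in the (N, 3, 224, 224) image array."""
    return {
        label: fields["image_tensors"].shape[0]
        for label, fields in tensor_dict[subdataset].items()
    }
```

Calling `count_images(tensor_dict, "counting")` on a loaded ClevrCounting dictionary would, under these assumptions, give the number of images available for each label.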

The ABO dataset contains an additional label_to_text.json file, which provides a text template for each subdataset and label.

A list of available datasets and subdatasets

| Dataset | Dataset name (`-i`) | Subdataset name (`-d`) |
| --- | --- | --- |
| Clevr Counting | `ClevrCounting` | `counting` |
| Amazon Berkeley Objects (ABO) | `ABO` | `material`, `color` |
| Caltech-UCSD Birds 200 (CUB) | `CUB` | `classification` |
| Fungi | `Fungi` | `classification` |
| Mini-imagenet | `mini` | `classification` |
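
Because the `-i` and `-d` values must match the names in the table above exactly, a small check before launching a run can catch typos early. The following is a minimal sketch; the `AVAILABLE` mapping is copied from the table, while the `check_names` helper is hypothetical and not part of this repository.

```python
# Dataset names (-i) and their subdataset names (-d), copied from the table.
AVAILABLE = {
    "ClevrCounting": ["counting"],
    "ABO": ["material", "color"],
    "CUB": ["classification"],
    "Fungi": ["classification"],
    "mini": ["classification"],
}


def check_names(dataset, subdataset):
    """Raise ValueError if the (-i, -d) pair is not listed in the table."""
    if dataset not in AVAILABLE:
        raise ValueError(
            f"unknown dataset {dataset!r}; expected one of {sorted(AVAILABLE)}"
        )
    if subdataset not in AVAILABLE[dataset]:
        raise ValueError(
            f"unknown subdataset {subdataset!r} for {dataset!r}; "
            f"expected one of {AVAILABLE[dataset]}"
        )
    return dataset, subdataset
```

For example, `check_names("ABO", "color")` passes, while `check_names("CUB", "color")` raises a `ValueError`.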

Training with provided datasets

run.sh provides example commands for training and meta-testing on our datasets.

Output format

Each model checkpoint dir contains two files:

step1.ckpt: model checkpoint after training phase
dev_test_results.json: scores for each task configuration on the dev and test sets during meta-testing
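
To inspect a finished run, the two files can be located and read with the standard library alone. This is a sketch: the helper names `has_checkpoint` and `load_results` are hypothetical, and the internal layout of dev_test_results.json is not assumed beyond it being valid JSON.

```python
import json
from pathlib import Path


def has_checkpoint(checkpoint_dir):
    """True if step1.ckpt exists in the given model checkpoint directory."""
    return (Path(checkpoint_dir) / "step1.ckpt").is_file()


def load_results(checkpoint_dir):
    """Parse dev_test_results.json from a model checkpoint directory."""
    path = Path(checkpoint_dir) / "dev_test_results.json"
    with path.open(encoding="utf-8") as f:
        return json.load(f)
```

Printing the result with `json.dumps(load_results(checkpoint_dir), indent=2)` is a quick way to see which task configurations were scored.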

Loading checkpoint

Here is an example snippet for loading step1.ckpt from multitask-finetuning/classical-finetuning/zeroshot models:

    # <checkpoint_dir> is a placeholder for the model checkpoint directory
    model = MultitaskFinetuneCLIP()
    model = model.load_from_checkpoint(checkpoint_path="<checkpoint_dir>/step1.ckpt")

Here is an example snippet for loading step1.ckpt from fomaml models:

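    import torch
    import learn2learn as l2l

    # <checkpoint_dir> is a placeholder for the model checkpoint directory
    model = LightningCLIP()
    model = l2l.algorithms.MAML(model, lr=1e-5, first_order=True)
    model.load_state_dict(torch.load("<checkpoint_dir>/step1.ckpt"))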

Training with custom datasets

Preprocess dataset

Put your new dataset into data/ in the same format as the provided datasets.
Specify template_function or the path to a label_to_text JSON file (an example can be found at /data/ABO/label_to_text.json) at lines 350 and 355 of data.py.
preprocess.sh provides an example of running data.py to create the pickle file for your new dataset.
Add your dataset to construct_dataset(): line 77 in train.py and line 80 in train_MAML.py.
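
The signature that data.py expects for template_function is not documented here, so the following is only a sketch under two assumptions: that a template function maps a label name to a filled text prompt, and that a label_to_text JSON file nests subdataset name, then label name, then text. The prompt wording, the function signatures, and the JSON layout are all hypothetical; check data.py and /data/ABO/label_to_text.json for the actual conventions.

```python
import json


def template_function(label, template="a photo of a {}."):
    """Hypothetical template: fill a label name into a text prompt."""
    return template.format(label.replace("_", " "))


def text_from_json(json_string, subdataset, label):
    """Hypothetical lookup, assuming {subdataset: {label: text}} nesting."""
    label_to_text = json.loads(json_string)
    return label_to_text[subdataset][label]
```

A template function suits datasets whose label names can be dropped directly into a sentence, while a JSON mapping is more convenient when each label needs hand-written text.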

Train

Modify run.sh to train and meta-test on your own dataset.
Refer to train.py and train_MAML.py for the default hyperparameters of each algorithm and the ones to tune.

Owner

Zhenhailong Wang
