ICLR 2021 i-Mix: A Domain-Agnostic Strategy for Contrastive Representation Learning

Last update: Nov 27, 2022

Related tags

Overview

Introduction

PyTorch code for the ICLR 2021 paper [i-Mix: A Domain-Agnostic Strategy for Contrastive Representation Learning].

@inproceedings{lee2021imix,
  title={i-Mix: A Domain-Agnostic Strategy for Contrastive Representation Learning},
  author={Lee, Kibok and Zhu, Yian and Sohn, Kihyuk and Li, Chun-Liang and Shin, Jinwoo and Lee, Honglak},
  booktitle={ICLR},
  year={2021}
}

Dependencies

python 3.7.4
numpy 1.17.2
pytorch 1.4.0
torchvision 0.5.0
cudatoolkit 10.1
librosa 0.8.0 for speech_commands
PIL 6.2.0 for GaussianBlur

Data

CIFAR-10/100 will automatically be downloaded.
For ImageNet, please refer to the [PyTorch ImageNet example]. The folder structure should be like data/imagenet/train/n01440764/
For speech commands, run bash speech_commands/download_speech_commands_dataset.sh.
For tabular datasets, download [covtype.data.gz] and [HIGGS.csv.gz], and place them in data/. They are processed when first loaded.

Running scripts

Please refer to [run.sh].

Plug-in example

For those who want to apply our method in their own code, we provide a minimal example based on [MoCo]:

# mixup: somewhere in main_moco.py
def mixup(input, alpha):
    beta = torch.distributions.beta.Beta(alpha, alpha)
    randind = torch.randperm(input.shape[0], device=input.device)
    lam = beta.sample([input.shape[0]]).to(device=input.device)
    lam = torch.max(lam, 1. - lam)
    lam_expanded = lam.view([-1] + [1]*(input.dim()-1))
    output = lam_expanded * input + (1. - lam_expanded) * input[randind]
    return output, randind, lam

# cutmix: somewhere in main_moco.py
def cutmix(input, alpha):
    beta = torch.distributions.beta.Beta(alpha, alpha)
    randind = torch.randperm(input.shape[0], device=input.device)
    lam = beta.sample().to(device=input.device)
    lam = torch.max(lam, 1. - lam)
    (bbx1, bby1, bbx2, bby2), lam = rand_bbox(input.shape[-2:], lam)
    output = input.clone()
    output[..., bbx1:bbx2, bby1:bby2] = output[randind][..., bbx1:bbx2, bby1:bby2]
    return output, randind, lam

def rand_bbox(size, lam):
    W, H = size
    cut_rat = (1. - lam).sqrt()
    cut_w = (W * cut_rat).to(torch.long)
    cut_h = (H * cut_rat).to(torch.long)

    cx = torch.zeros_like(cut_w, dtype=cut_w.dtype).random_(0, W)
    cy = torch.zeros_like(cut_h, dtype=cut_h.dtype).random_(0, H)

    bbx1 = (cx - cut_w // 2).clamp(0, W)
    bby1 = (cy - cut_h // 2).clamp(0, H)
    bbx2 = (cx + cut_w // 2).clamp(0, W)
    bby2 = (cy + cut_h // 2).clamp(0, H)

    new_lam = 1. - (bbx2 - bbx1).to(lam.dtype) * (bby2 - bby1).to(lam.dtype) / (W * H)

    return (bbx1, bby1, bbx2, bby2), new_lam

# https://github.com/facebookresearch/moco/blob/master/main_moco.py#L193
criterion = nn.CrossEntropyLoss(reduction='none').cuda(args.gpu)

# https://github.com/facebookresearch/moco/blob/master/main_moco.py#L302-L303
images[0], target_aux, lam = mixup(images[0], alpha=1.)
# images[0], target_aux, lam = cutmix(images[0], alpha=1.)
target = torch.arange(images[0].shape[0], dtype=torch.long).cuda()
output, _ = model(im_q=images[0], im_k=images[1])
loss = lam * criterion(output, target) + (1. - lam) * criterion(output, target_aux)

# https://github.com/facebookresearch/moco/blob/master/moco/builder.py#L142-L149
contrast = torch.cat([k, self.queue.clone().detach().t()], dim=0)
logits = torch.mm(q, contrast.t())

Note

builder.py is adapted from [MoCo] and [PyContrast].
main_*.py is adapted from [PyTorch ImageNet example] and [Mo Co].
models/resnet.py is adapted from [torchvision].
speech_commands/ is adapted from [this repo].

ICLR 2021 i-Mix: A Domain-Agnostic Strategy for Contrastive Representation Learning

Related tags

Overview

Introduction

Dependencies

Data

Running scripts

Plug-in example

Note

Owner

Kibok Lee

A Pytorch Implementation of Source Data-free Domain Adaptation for a Faster R-CNN

Contains a bunch of different python programm tasks

Basics of 2D and 3D Human Pose Estimation.

Segmentation for medical image.

Explainer for black box models that predict molecule properties

[ICML 2020] DrRepair: Learning to Repair Programs from Error Messages

TransFGU: A Top-down Approach to Fine-Grained Unsupervised Semantic Segmentation

Detectorch - detectron for PyTorch

Code for our WACV 2022 paper "Hyper-Convolution Networks for Biomedical Image Segmentation"

From this paper "SESNet: A Semantically Enhanced Siamese Network for Remote Sensing Change Detection"

A simplified framework and utilities for PyTorch

Transferable Unrestricted Attacks, which won 1st place in CVPR’21 Security AI Challenger: Unrestricted Adversarial Attacks on ImageNet.

CS5242_2021 - Neural Networks and Deep Learning, NUS CS5242, 2021

AdaShare: Learning What To Share For Efficient Deep Multi-Task Learning

StyleTransfer - Open source style transfer project, based on VGG19

Repository for the electrical and ICT benchmark model developed in the ERIGrid 2.0 project.

SparseML is a libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models

Mememoji - A facial expression classification system that recognizes 6 basic emotions: happy, sad, surprise, fear, anger and neutral.

PyTorch implementation of paper "StarEnhancer: Learning Real-Time and Style-Aware Image Enhancement" (ICCV 2021 Oral)

Replication Package for "An Empirical Study of the Effectiveness of an Ensemble of Stand-alone Sentiment Detection Tools for Software Engineering Datasets"