Code for DetCon

This repository contains code for the ICCV 2021 paper "Efficient Visual Pretraining with Contrastive Detection" by Olivier J. Hénaff, Skanda Koppula, Jean-Baptiste Alayrac, Aaron van den Oord, Oriol Vinyals, João Carreira.

This repository includes sample code to run pretraining with DetCon. In particular, we're providing a sample script for generating the Felzenszwalb segmentations for ImageNet images (using skimage) and a pre-training experiment setup (dataloader, augmentation pipeline, optimization config, and loss definition) for the DetCon-B(YOL) model described in the paper. The original code uses a large grid of TPUs and internal infrastructure for training, but we've extracted the key DetCon loss and experiment setup in this folder so that external groups have a reference should they want to explore similar approaches.
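As a rough illustration of how segmentation masks enter the loss: in the paper, backbone features are pooled within each mask to give one vector per region, and the contrastive objective is computed over those vectors. The NumPy sketch below shows only that pooling step. The function name mask_pool and the toy inputs are hypothetical and are not taken from this repository's loss definition.

```python
import numpy as np


def mask_pool(features, mask, num_regions):
    """Average the feature vectors that fall inside each mask region.

    features: float array of shape (H, W, D), e.g. a backbone feature map.
    mask: int array of shape (H, W) with region ids in [0, num_regions).
    Returns an array of shape (num_regions, D); empty regions yield zeros.
    """
    h, w, d = features.shape
    flat_feats = features.reshape(h * w, d)
    # One-hot region membership for every pixel: shape (H*W, num_regions).
    one_hot = np.eye(num_regions, dtype=features.dtype)[mask.reshape(-1)]
    # Sum the features per region, then divide by each region's pixel count.
    sums = one_hot.T @ flat_feats
    counts = one_hot.sum(axis=0)[:, None]
    return sums / np.maximum(counts, 1.0)


# Toy example: a 2x2 feature map split into a left and a right region.
features = np.array([[[1.0, 0.0], [0.0, 2.0]],
                     [[3.0, 0.0], [0.0, 4.0]]])
mask = np.array([[0, 1],
                 [0, 1]])
pooled = mask_pool(features, mask, num_regions=2)
```

Here pooled has one row per region: the mean of the left-column features and the mean of the right-column features.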

This repository builds heavily on the BYOL open source release, so speed-up tricks and features in that setup will likely translate to the code here.

Running this code

Running ./setup.sh will create and activate a virtualenv and install all necessary dependencies. To enter the environment after running setup.sh, run source /tmp/detcon_venv/bin/activate.

Running bash test.sh will run a single training step on a mock image/Felzenszwalb mask as a simple validation that all dependencies are set up correctly and the DetCon pre-training can run smoothly. On our 16-core machine, running on CPU, we find this takes around 3-4 minutes.

A TFRecord dataset containing each ImageNet image, its label, and its corresponding Felzenszwalb segmentation mask can then be generated using the generate_fh_masks_for_imagenet.py script. You will first have to download two pieces of ImageNet metadata into the same directory as the script:

wget https://raw.githubusercontent.com/tensorflow/models/master/research/slim/datasets/imagenet_metadata.txt
wget https://raw.githubusercontent.com/tensorflow/models/master/research/slim/datasets/imagenet_lsvrc_2015_synsets.txt
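For orientation, here is a minimal sketch of what the two metadata files contain, assuming the usual layout: one synset id per line in imagenet_lsvrc_2015_synsets.txt, and tab-separated "synset id, description" pairs in imagenet_metadata.txt. The helper load_synset_labels is hypothetical and is not part of the mask generation script.

```python
def load_synset_labels(synsets_lines, metadata_lines):
    """Map each challenge synset id to its human-readable description."""
    # Each metadata line is assumed to look like '<wnid>\t<description>'.
    names = {}
    for line in metadata_lines:
        line = line.strip()
        if not line:
            continue
        wnid, _, description = line.partition("\t")
        names[wnid] = description
    # Keep only the synsets listed in the challenge file, in file order.
    synsets = [s.strip() for s in synsets_lines if s.strip()]
    return {wnid: names.get(wnid, "") for wnid in synsets}


# In-memory stand-ins for the two downloaded files.
synsets_lines = ["n01440764\n", "n01443537\n"]
metadata_lines = [
    "n00004475\torganism, being\n",
    "n01440764\ttench, Tinca tinca\n",
    "n01443537\tgoldfish, Carassius auratus\n",
]
labels = load_synset_labels(synsets_lines, metadata_lines)
```

With the real files, the same function can be given two open file handles instead of the lists above.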

Then run the multi-threaded mask generation script:

python generate_fh_masks_for_imagenet.py -- \
--train_directory=imagenet-train \
--output_directory=imagenet-train-fh

This single-machine, multi-threaded version of the mask generation script takes 2-3 days on a 16-core CPU machine to process the ImageNet training and validation sets. The script assumes the same ImageNet directory structure as github.com/tensorflow/models/blob/master/research/slim/datasets/build_imagenet_data.py (more details in the link).
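To give a sense of the per-image work, the sketch below computes a Felzenszwalb segmentation for a single synthetic image with skimage. The scale, sigma, and min_size values are illustrative placeholders and are not necessarily the settings used by the script, and the random image stands in for a decoded ImageNet image.

```python
import numpy as np
from skimage.segmentation import felzenszwalb

# Synthetic RGB image with float values in [0, 1); a stand-in for real data.
rng = np.random.default_rng(0)
image = rng.random((64, 64, 3))

# Each pixel receives an integer region id; the parameter values here are
# placeholders chosen for illustration only.
mask = felzenszwalb(image, scale=1000, sigma=0.5, min_size=50)
num_regions = int(mask.max()) + 1
```

The resulting mask has the same height and width as the image, which is what allows it to be stored alongside the image and label in each record.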

You can then run the main training loop, which executes multiple DetCon-B training steps, by running the following command from the parent directory:

python -m detcon.main_loop \
  --dataset_directory='/tmp/imagenet-fh-train' \
  --pretrain_epochs=100

Note that you will need to update the dataset_directory flag to point to the Felzenszwalb/image TFRecord dataset generated in the previous step. Additionally, to use accelerators, you will need to install the correct version of jaxlib with CUDA support.

Citing this work

If you use this code in your work, please consider referencing our work:

@article{henaff2021efficient,
  title={{Efficient Visual Pretraining with Contrastive Detection}},
  author={H{\'e}naff, Olivier J and Koppula, Skanda and Alayrac, Jean-Baptiste and Oord, Aaron van den and Vinyals, Oriol and Carreira, Jo{\~a}o},
  journal={International Conference on Computer Vision},
  year={2021}
}

Disclaimer

This is not an officially supported Google product.

Owner

DeepMind