A JAX implementation of Broaden Your Views for Self-Supervised Video Learning, or BraVe for short.

BraVe

The model provided in this package was implemented based on the internal model that was used to compute results for the accompanying paper. It achieves comparable results on the evaluation tasks when evaluated side-by-side. Not all details are guaranteed to be identical though, and some results may differ from those given in the paper. In particular, this implementation does not provide the option to train with optical flow.

We provide a selection of pretrained checkpoints in the table below, which can be evaluated directly against HMDB 51 with the evaluation tools in this package. These are exactly the checkpoints that were used to produce the numbers in the accompanying paper; they were not trained with the exact trainer given in this package. For details on training a model with this package, please see the end of this readme.

In the table below, each configuration is written as narrow view/broad view. For example, V/A denotes video (narrow view) to audio (broad view), and V/F denotes a narrow view containing video and a broad view containing optical flow.

The backbone in each case is TSMResnet, with a given width multiplier (please see the accompanying paper for further details). For all of the given numbers below, the SVM regularization constant used is 0.0001. For HMDB 51, the average is given in brackets, followed by the top-1 percentages for each of the splits.

Views   Architecture  HMDB51                               UCF-101  K600   Trained with this package  Checkpoint
V/AF    TSM (1X)      (69.2%) 71.307%, 68.497%, 67.843%    92.9%    69.2%  No                         download
V/AF    TSM (2X)      (69.9%) 72.157%, 68.432%, 69.02%     93.2%    70.2%  No                         download
V/A     TSM (1X)      (69.4%) 70.131%, 68.889%, 69.085%    93.0%    70.6%  No                         download
V/VVV   TSM (1X)      (65.4%) 66.797%, 63.856%, 65.425%    92.6%    70.8%  No                         download
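
As a quick sanity check, the bracketed HMDB 51 averages can be recomputed from the per-split top-1 numbers listed in the table. The snippet below is only a worked arithmetic example; the dictionary keys are labels chosen here for illustration and are not identifiers used by the package.

```python
from statistics import mean

# Per-split HMDB 51 top-1 accuracies (%), copied from the table above.
# The keys are illustrative labels, not names used by the brave package.
split_top1 = {
    "V/AF TSM (1X)": [71.307, 68.497, 67.843],
    "V/AF TSM (2X)": [72.157, 68.432, 69.02],
    "V/A TSM (1X)": [70.131, 68.889, 69.085],
    "V/VVV TSM (1X)": [66.797, 63.856, 65.425],
}

# Average over the three splits, rounded to one decimal place as in the table.
averages = {name: round(mean(values), 1) for name, values in split_top1.items()}

for name, average in averages.items():
    print(f"{name}: {average}%")
```

The rounded means match the bracketed values (69.2, 69.9, 69.4 and 65.4).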

Reproducing results from the paper

This package provides everything needed to evaluate the above checkpoints against HMDB 51. It supports Python 3.7 and above.

To get started, we recommend using a clean virtualenv. You may then install the brave package directly from GitHub using,

  pip install git+https://github.com/deepmind/brave.git

A pre-processed version of the HMDB 51 dataset can be downloaded using the following command. It requires that both ffmpeg and unrar are available. The following will download the dataset to /tmp/hmdb51/, but any other location would also work.

  python -m brave.download_hmdb --output_dir /tmp/hmdb51/

To evaluate a checkpoint downloaded from the above table, the following may be used. The dataset shards arguments should be set to match the paths used above.

  python -m brave.evaluate_video_embeddings \
    --checkpoint_path <path/to/downloaded/checkpoint>.npy \
    --train_dataset_shards '/tmp/hmdb51/split_1/train/*' \
    --test_dataset_shards '/tmp/hmdb51/split_1/test/*' \
    --svm_regularization 0.0001 \
    --batch_size 8
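
To evaluate more than one split, the command above can be generated once per split. The following is a minimal sketch that only builds and prints the commands; it reuses the flags shown above. The helper name build_eval_command, the placeholder checkpoint path, and the assumption that the other splits live in split_2 and split_3 directories next to split_1 are illustrative and should be checked against the downloaded dataset.

```python
import shlex


def build_eval_command(checkpoint_path, split, dataset_root="/tmp/hmdb51",
                       svm_regularization=0.0001, batch_size=8):
    """Builds the evaluation command for one HMDB 51 split as an argument list.

    Assumes (not confirmed by this readme) that split directories are named
    split_1, split_2 and split_3 under dataset_root.
    """
    split_dir = f"{dataset_root}/split_{split}"
    return [
        "python", "-m", "brave.evaluate_video_embeddings",
        "--checkpoint_path", checkpoint_path,
        "--train_dataset_shards", f"{split_dir}/train/*",
        "--test_dataset_shards", f"{split_dir}/test/*",
        "--svm_regularization", str(svm_regularization),
        "--batch_size", str(batch_size),
    ]


# "/path/to/checkpoint.npy" is a placeholder for a downloaded checkpoint.
for split in (1, 2, 3):
    command = build_eval_command("/path/to/checkpoint.npy", split)
    # Quote each argument so the glob patterns are not expanded by the shell.
    print(" ".join(shlex.quote(arg) for arg in command))
```

Each printed line can be pasted into a shell, or the argument list can be passed to subprocess.run to launch the evaluation directly.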

Note that any of the three splits can be evaluated by changing the dataset split paths.

To run this efficiently using a GPU, it is also necessary to install the correct version of jaxlib. To install jaxlib with support for CUDA 10.1 on Linux, the following install should be sufficient for Python 3.9 (the cp39 tag in the wheel name); other precompiled packages may be found through the JAX documentation.

  pip install https://storage.googleapis.com/jax-releases/cuda101/jaxlib-0.1.69+cuda101-cp39-none-manylinux2010_x86_64.whl

Depending on the available GPU memory, the batch_size parameter may be tuned to obtain better performance, or to reduce the required GPU memory.
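
Before running a long evaluation, it can be worth confirming that JAX actually sees the GPU. The sketch below is one way to do this using jax.devices(); the helper name describe_jax_backend is illustrative and is not part of this package.

```python
def describe_jax_backend():
    """Returns a short summary of the devices JAX can see."""
    try:
        import jax
    except ImportError:
        return "jax is not installed"
    devices = jax.devices()
    # Each device reports its platform, e.g. "cpu" or "gpu".
    platforms = sorted({device.platform for device in devices})
    return f"platforms={platforms}, device_count={len(devices)}"


print(describe_jax_backend())
```

If only the cpu platform is reported on a machine with a GPU, the installed jaxlib was most likely built without CUDA support.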

Training a network

This package may also be used to train a model from scratch using jaxline. To try this, first ensure the configuration is set appropriately by modifying brave/config.py. At minimum, you will need to choose an appropriate global batch size (the default setting of 512 is likely too large for any single-machine training setup) and set a value for dataset_shards, which should contain the paths of the tfrecord files containing the serialized training data.
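
One way to collect the shard paths is to expand a glob pattern and fail early if nothing matches. This is a minimal sketch only: the helper find_dataset_shards and the file names are hypothetical, and how the resulting list is assigned to dataset_shards depends on the layout of brave/config.py, which is not reproduced here.

```python
import glob
import os
import tempfile


def find_dataset_shards(pattern):
    """Returns the sorted shard paths matching pattern, or raises if none exist."""
    shards = sorted(glob.glob(pattern))
    if not shards:
        raise FileNotFoundError(f"No shard files match {pattern!r}")
    return shards


# Demonstration on empty placeholder files in a temporary directory.
# Real training data would be tfrecord files in the DMVR format.
with tempfile.TemporaryDirectory() as tmp_dir:
    for name in ("train-00001.tfrecord", "train-00000.tfrecord"):
        open(os.path.join(tmp_dir, name), "w").close()
    demo_shards = find_dataset_shards(os.path.join(tmp_dir, "*.tfrecord"))
    demo_names = [os.path.basename(path) for path in demo_shards]

print(demo_names)
```

Raising when the pattern matches nothing avoids launching a training job that would otherwise fail later, or silently read no data.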

For details on checkpointing and distributing computation, see the jaxline documentation.

As with evaluation, the correct jaxlib package must be installed to enable training on a GPU.

The training may now be launched using,

  python -m brave.experiment --config=brave/config.py

Training datasets

This model is able to read data stored in the format specified by DMVR. For an example of writing training data in the correct format see the code in dataset/fixtures.py, which is used to write the test fixtures used in the tests for this package.

Running the tests

After checking out this code locally, you may run the package tests using

  pip install -e .
  pytest brave

We recommend doing this from a clean virtual environment.

Citing this work

If you use this code (or any derived code), data or these models in your work, please cite the relevant accompanying paper.

@misc{recasens2021broaden,
      title={Broaden Your Views for Self-Supervised Video Learning},
      author={Adrià Recasens and Pauline Luc and Jean-Baptiste Alayrac and Luyu Wang and Ross Hemsley and Florian Strub and Corentin Tallec and Mateusz Malinowski and Viorica Patraucean and Florent Altché and Michal Valko and Jean-Bastien Grill and Aäron van den Oord and Andrew Zisserman},
      year={2021},
      eprint={2103.16559},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Disclaimer

This is not an official Google product.
