SwinTransformerV2-TensorFlow

A TensorFlow implementation of SwinTransformerV2, the architecture from Microsoft Research Asia, based on their official implementation of SwinTransformerV1 and their paper on V2.

Paper on Version 2 (18/11/2021): [arXiv]

Paper on Version 1 (17/08/2021): [arXiv]

Features:

TensorFlow 2 implementation of versions 1 and 2 of the SwinTransformer, a state-of-the-art backbone for many contemporary computer vision tasks. A brief overview of the architectural changes made in version 2:

A residual post-norm configuration replaces the previous pre-norm configuration: layer normalization is applied to the output of each residual branch before it is added back to the main branch, which is meant to improve training stability in larger models.
A scaled cosine attention replaces the dot product attention in V1: the attention logits are the cosine similarity between queries and keys, divided by a learnable scalar.
A continuous log-spaced relative position bias is used instead of the previous parametric table approach. This is implemented here as a small MLP network applied to log-transformed relative coordinates.
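
The last two changes can be illustrated with a minimal NumPy sketch. It is not this repository's code: the function names `log_spaced_coords` and `scaled_cosine_attention` are hypothetical, the attention is single-head, and the scalar `tau` is a fixed value here rather than a trained variable. The coordinate transform follows the formula in the V2 paper, sign(d) * log(1 + |d|); the MLP that turns those coordinates into a bias is left out, so the bias is just an optional input.

```python
import numpy as np


def log_spaced_coords(delta):
    """Map signed relative offsets to log space: sign(d) * log(1 + |d|)."""
    delta = np.asarray(delta, dtype=np.float64)
    return np.sign(delta) * np.log1p(np.abs(delta))


def scaled_cosine_attention(q, k, v, tau, bias=None):
    """Single-head attention with cosine-similarity logits divided by tau.

    q, k, v: arrays of shape (tokens, dim); tau: positive scalar;
    bias: optional (tokens, tokens) relative position bias.
    """
    eps = 1e-12  # guards against division by zero for all-zero rows
    q_unit = q / (np.linalg.norm(q, axis=-1, keepdims=True) + eps)
    k_unit = k / (np.linalg.norm(k, axis=-1, keepdims=True) + eps)
    logits = q_unit @ k_unit.T / tau
    if bias is not None:
        logits = logits + bias
    # Subtract the row maximum for a numerically stable softmax.
    logits = logits - logits.max(axis=-1, keepdims=True)
    weights = np.exp(logits)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights


# Example with random inputs (4 tokens, 8 channels).
rng = np.random.default_rng(0)
q, k, v = (rng.normal(size=(4, 8)) for _ in range(3))
out, weights = scaled_cosine_attention(q, k, v, tau=0.1)
```

Because queries and keys are normalized before the product, rescaling either of them leaves the attention weights unchanged; only `tau` and the bias control how sharp the softmax is.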

Requirements:

numpy==1.21.4
tensorflow==2.7.0
tensorflow_addons==0.15.0

Getting started

Usage instructions are still being written.

License

This project is licensed under the MIT license.

Citation

@article{liu2021Swin,
  title={Swin Transformer: Hierarchical Vision Transformer using Shifted Windows},
  author={Liu, Ze and Lin, Yutong and Cao, Yue and Hu, Han and Wei, Yixuan and Zhang, Zheng and Lin, Stephen and Guo, Baining},
  journal={arXiv preprint arXiv:2103.14030},
  year={2021}
}

Owner

Phan Nguyen
