Certified Patch Robustness via Smoothed Vision Transformers

Overview

Certified Patch Robustness via Smoothed Vision Transformers

This repository contains the code for replicating the results of our paper:

Certified Patch Robustness via Smoothed Vision Transformers
Hadi Salman*, Saachi Jain*, Eric Wong*, Aleksander Madry

Paper
Blog post Part I.
Blog post Part II.

    @article{salman2021certified,
        title={Certified Patch Robustness via Smoothed Vision Transformers},
        author={Hadi Salman and Saachi Jain and Eric Wong and Aleksander Madry},
        booktitle={ArXiv preprint arXiv:2110.07719},
        year={2021}
    }

Getting started

Our code relies on the MadryLab public robustness library, which will be automatically installed when you follow the instructions below.

  1. Clone our repo: git clone https://github.mit.edu/hady/smoothed-vit

  2. Install dependencies:

    conda create -n smoothvit python=3.8
    conda activate smoothvit
    pip install -r requirements.txt
    

Full pipeline for building smoothed ViTs.

Now, we will walk you through the steps to create a smoothed ViT on the CIFAR-10 dataset. Similar steps can be followed for other datasets.

The entry point of our code is main.py (see the file for a full description of arguments).

First we will train the base classifier with ablations as data augmentation. Then we will apply derandomizd smoothing to build a smoothed version of the model which is certifiably robust.

Training the base classifier

The first step is to train the base classifier (here a ViT-Tiny) with ablations.

python src/main.py \
      --dataset cifar10 \
      --data /tmp \
      --arch deit_tiny_patch16_224 \
      --pytorch-pretrained \
      --out-dir OUTDIR \
      --exp-name demo \
      --epochs 30 \
      --lr 0.01 \
      --step-lr 10 \
      --batch-size 128 \
      --weight-decay 5e-4 \
      --adv-train 0 \
      --freeze-level -1 \
      --drop-tokens \
      --cifar-preprocess-type simple224 \
      --ablate-input \
      --ablation-type col \
      --ablation-size 4

Once training is done, the mode is saved in OUTDIR/demo/.

Certifying the smoothed classifier

Now we are ready to apply derandomized smoothing to obtain certificates for each datapoint against adversarial patches. To do so, simply run:

python src/main.py \
      --dataset cifar10 \
      --data /tmp \
      --arch deit_tiny_patch16_224 \
      --out-dir OUTDIR \
      --exp-name demo \
      --batch-size 128 \
      --adv-train 0 \
      --freeze-level -1 \
      --drop-tokens \
      --cifar-preprocess-type simple224 \
      --resume \
      --eval-only 1 \
      --certify \
      --certify-out-dir OUTDIR_CERT \
      --certify-mode col \
      --certify-ablation-size 4 \
      --certify-patch-size 5

This will calculate the standard and certified accuracies of the smoothed model. The results will be dumped into OUTDIR_CERT/demo/.

That's it! Now you can replicate all the results of our paper.

Download our ImageNet models

If you find our pretrained models useful, please consider citing our work.

Models trained with column ablations

Model Ablation Size = 19
ResNet-18 LINK
ResNet-50 LINK
WRN-101-2 LINK
ViT-T LINK
ViT-S LINK
ViT-B LINK

We have uploaded the most important models. If you need any other model (for the sweeps for example) please let us know and we are happy to provide!

Maintainers

Owner
Madry Lab
Towards a Principled Science of Deep Learning
Madry Lab
[ICCV2021] Official Pytorch implementation for SDGZSL (Semantics Disentangling for Generalized Zero-Shot Learning)

Semantics Disentangling for Generalized Zero-shot Learning This is the official implementation for paper Zhi Chen, Yadan Luo, Ruihong Qiu, Zi Huang, J

25 Dec 06, 2022
[IROS2021] NYU-VPR: Long-Term Visual Place Recognition Benchmark with View Direction and Data Anonymization Influences

NYU-VPR This repository provides the experiment code for the paper Long-Term Visual Place Recognition Benchmark with View Direction and Data Anonymiza

Automation and Intelligence for Civil Engineering (AI4CE) Lab @ NYU 22 Sep 28, 2022
Learn about Spice.ai with in-depth samples

Samples Learn about Spice.ai with in-depth samples ServerOps - Learn when to run server maintainance during periods of low load Gardener - Intelligent

Spice.ai 16 Mar 23, 2022
Image Captioning using CNN and Transformers

Image-Captioning Keras/Tensorflow Image Captioning application using CNN and Transformer as encoder/decoder. In particulary, the architecture consists

24 Dec 28, 2022
Official implementation of the paper Do pedestrians pay attention? Eye contact detection for autonomous driving

Do pedestrians pay attention? Eye contact detection for autonomous driving Official implementation of the paper Do pedestrians pay attention? Eye cont

VITA lab at EPFL 26 Nov 02, 2022
[NeurIPS'21 Spotlight] PyTorch code for our paper "Aligned Structured Sparsity Learning for Efficient Image Super-Resolution"

ASSL This repository is for a new network pruning method (Aligned Structured Sparsity Learning, ASSL) for efficient single image super-resolution (SR)

Huan Wang 47 Nov 28, 2022
CCPD: a diverse and well-annotated dataset for license plate detection and recognition

CCPD (Chinese City Parking Dataset, ECCV) UPdate on 10/03/2019. CCPD Dataset is now updated. We are confident that images in subsets of CCPD is much m

detectRecog 1.8k Dec 30, 2022
Efficient neural networks for analog audio effect modeling

micro-TCN Efficient neural networks for audio effect modeling

Christian Steinmetz 94 Dec 29, 2022
Large scale PTM - PPI relation extraction

Large-scale protein-protein post-translational modification extraction with distant supervision and confidence calibrated BioBERT The silver standard

1 Feb 25, 2022
disentanglement_lib is an open-source library for research on learning disentangled representations.

disentanglement_lib disentanglement_lib is an open-source library for research on learning disentangled representation. It supports a variety of diffe

Google Research 1.3k Dec 28, 2022
System-oriented IR evaluations are limited to rather abstract understandings of real user behavior

Validating Simulations of User Query Variants This repository contains the scripts of the experiments and evaluations, simulated queries, as well as t

IR Group at Technische Hochschule Köln 2 Nov 23, 2022
Boundary-aware Transformers for Skin Lesion Segmentation

Boundary-aware Transformers for Skin Lesion Segmentation Introduction This is an official release of the paper Boundary-aware Transformers for Skin Le

Jiacheng Wang 79 Dec 16, 2022
Information Gain Filtration (IGF) is a method for filtering domain-specific data during language model finetuning. IGF shows significant improvements over baseline fine-tuning without data filtration.

Information Gain Filtration Information Gain Filtration (IGF) is a method for filtering domain-specific data during language model finetuning. IGF sho

4 Jul 28, 2022
Few-Shot Graph Learning for Molecular Property Prediction

Few-shot Graph Learning for Molecular Property Prediction Introduction This is the source code and dataset for the following paper: Few-shot Graph Lea

Zhichun Guo 94 Dec 12, 2022
Interactive web apps created using geemap and streamlit

geemap-apps Introduction This repo demostrates how to build a multi-page Earth Engine App using streamlit and geemap. You can deploy the app on variou

Qiusheng Wu 27 Dec 23, 2022
This is the implementation of our work Deep Extreme Cut (DEXTR), for object segmentation from extreme points.

This is the implementation of our work Deep Extreme Cut (DEXTR), for object segmentation from extreme points.

Sergi Caelles 828 Jan 05, 2023
Python TFLite scripts for detecting objects of any class in an image without knowing their label.

Python TFLite scripts for detecting objects of any class in an image without knowing their label.

Ibai Gorordo 42 Oct 07, 2022
Bunch of different tools which helps visualizing and annotating images for semantic/instance segmentation tasks

Data Framework for Semantic/Instance Segmentation Bunch of different tools which helps visualizing, transforming and annotating images for semantic/in

Bruno Fernandes Carvalho 5 Dec 21, 2022
PFFDTD is an open-source FDTD simulator for 3D room acoustics

PFFDTD is an open-source FDTD simulator for 3D room acoustics

Brian Hamilton 34 Nov 24, 2022
Localized representation learning from Vision and Text (LoVT)

Localized Vision-Text Pre-Training Contrastive learning has proven effective for pre- training image models on unlabeled data and achieved great resul

Philip Müller 10 Dec 07, 2022