NeurIPS 2021

Title: Distilling Robust and Non-Robust Features in Adversarial Examples by Information Bottleneck (paper)

Authors: Junho Kim, Byung-Kwan Lee, and Yong Man Ro (*: equally contributed)

Affiliation: School of Electric Engineering, Korea Advanced Institute of Science and Technology (KAIST)

Email: `[email protected]`, `[email protected]`, `[email protected]`

This is official PyTorch Implementation code for the paper of "Distilling Robust and Non-Robust Features in Adversarial Examples by Information Bottleneck" published in NeurIPS 21. It provides novel method of decomposing robust and non-robust features in intermediate layer. Further, we understand the semantic information of distilled features, by directly visualizing robust and non-robust features in the feature representation space. Consequently, we reveal that both of the robust and non-robust features indeed have semantic information in terms of human-perception by themselves. For more detail, you can refer to our paper!

Citation

If you find this work helpful, please cite it as:

@inproceedings{
kim2021distilling,
title={Distilling Robust and Non-Robust Features in Adversarial Examples by Information Bottleneck},
author={Junho Kim and Byung-Kwan Lee and Yong Man Ro},
booktitle={Advances in Neural Information Processing Systems},
editor={A. Beygelzimer and Y. Dauphin and P. Liang and J. Wortman Vaughan},
year={2021},
url={https://openreview.net/forum?id=90M-91IZ0JC}
}

Datasets

Baseline Models

VGG-16 (model/vgg.py)
WideResNet-28-10 (model/wideresnet.py)

Adversarial Attacks (by torchattacks)

Fast Gradient Sign Method (FGSM)
Basic Iterative Method (BIM)
Projected Gradient Descent (PGD)
Carlini & Wagner (CW)
AutoAttack (AA)
Fast Adaptive Boundary (FAB)

This implementation details are described in loader/loader.py.

    # Gradient Clamping based Attack
    if args.attack == "fgsm":
        return torchattacks.FGSM(model=net, eps=args.eps)

    elif args.attack == "bim":
        return torchattacks.BIM(model=net, eps=args.eps, alpha=1/255)

    elif args.attack == "pgd":
        return torchattacks.PGD(model=net, eps=args.eps,
                                alpha=args.eps/args.steps*2.3, steps=args.steps, random_start=True)

    elif args.attack == "cw":
        return torchattacks.CW(model=net, c=0.1, lr=0.1, steps=200)

    elif args.attack == "auto":
        return torchattacks.APGD(model=net, eps=args.eps)

    elif args.attack == "fab":
        return torchattacks.FAB(model=net, eps=args.eps, n_classes=args.n_classes)

Included Packages (for Ours)

Informative Feature Package (model/IFP.py)
- Distilling robust and non-robust features in intermediate layer by Information Bottleneck
Visualization of robust and non-robust features (visualization/inversion.py)
Non-Robust Feature (NRF) and Robust Feature (RF) Attack (model/IFP.py)
- NRF : maximizing the magnitude of non-robust feature gradients
- NRF2 : minimizing the magnitude of non-robust feature gradients
- RF : maximizing the magnitude of robust feature gradients
- RF2 : minimizing the magnitude of robust feature gradients

Baseline Methods

Plain (Plain Training)

Run train_plain.py

  parser.add_argument('--lr', default=0.01, type=float, help='learning rate')
  parser.add_argument('--dataset', default='cifar10', type=str, help='dataset name')
  parser.add_argument('--network', default='vgg', type=str, help='network name')
  parser.add_argument('--gpu_id', default='0', type=str, help='gpu id')
  parser.add_argument('--data_root', default='./datasets', type=str, help='path to dataset')
  parser.add_argument('--epoch', default=60, type=int, help='epoch number')
  parser.add_argument('--batch_size', default=100, type=int, help='Batch size')
  parser.add_argument('--pretrained', default='false', type=str2bool, help='pretrained boolean')
  parser.add_argument('--batchnorm', default='true', type=str2bool, help='batchnorm boolean')
  parser.add_argument('--save_dir', default='./experiment', type=str, help='save directory')

AT (PGD Adversarial Training)

Run train_AT.py

  parser.add_argument('--lr', default=0.01, type=float, help='learning rate')
  parser.add_argument('--steps', default=10, type=int, help='adv. steps')
  parser.add_argument('--eps', default=0.03, type=float, help='max norm')
  parser.add_argument('--dataset', default='cifar10', type=str, help='dataset name')
  parser.add_argument('--network', default='vgg', type=str, help='network name')
  parser.add_argument('--gpu_id', default='0', type=str, help='gpu id')
  parser.add_argument('--data_root', default='./datasets', type=str, help='path to dataset')
  parser.add_argument('--epoch', default=60, type=int, help='epoch number')
  parser.add_argument('--batch_size', default=100, type=int, help='Batch size')
  parser.add_argument('--attack', default='pgd', type=str, help='attack type')
  parser.add_argument('--pretrained', default='false', type=str2bool, help='pretrained boolean')
  parser.add_argument('--batchnorm', default='true', type=str2bool, help='batchnorm boolean')
  parser.add_argument('--save_dir', default='./experiment', type=str, help='save directory')

TRADES (Recent defense method)

Run train_TRADES.py

  parser.add_argument('--lr', default=0.01, type=float, help='learning rate')
  parser.add_argument('--steps', default=10, type=int, help='adv. steps')
  parser.add_argument('--eps', default=0.03, type=float, help='max norm')
  parser.add_argument('--dataset', default='cifar10', type=str, help='dataset name')
  parser.add_argument('--network', default='wide', type=str, help='network name: vgg or wide')
  parser.add_argument('--gpu_id', default='0', type=str, help='gpu id')
  parser.add_argument('--data_root', default='./datasets', type=str, help='path to dataset')
  parser.add_argument('--epoch', default=60, type=int, help='epoch number')
  parser.add_argument('--batch_size', default=100, type=int, help='Batch size')
  parser.add_argument('--attack', default='pgd', type=str, help='attack type')
  parser.add_argument('--pretrained', default='false', type=str2bool, help='pretrained boolean')
  parser.add_argument('--batchnorm', default='true', type=str2bool, help='batchnorm boolean')
  parser.add_argument('--save_dir', default='./experiment', type=str, help='save directory')

MART (Recent defense method)

Run train_MART.py

  parser.add_argument('--lr', default=0.01, type=float, help='learning rate')
  parser.add_argument('--steps', default=10, type=int, help='adv. steps')
  parser.add_argument('--eps', default=0.03, type=float, help='max norm')
  parser.add_argument('--dataset', default='cifar10', type=str, help='dataset name')
  parser.add_argument('--network', default='wide', type=str, help='network name')
  parser.add_argument('--gpu_id', default='0', type=str, help='gpu id')
  parser.add_argument('--data_root', default='./datasets', type=str, help='path to dataset')
  parser.add_argument('--epoch', default=60, type=int, help='epoch number')
  parser.add_argument('--batch_size', default=100, type=int, help='Batch size')
  parser.add_argument('--attack', default='pgd', type=str, help='attack type')
  parser.add_argument('--pretrained', default='false', type=str2bool, help='pretrained boolean')
  parser.add_argument('--batchnorm', default='true', type=str2bool, help='batchnorm boolean')
  parser.add_argument('--save_dir', default='./experiment', type=str, help='save directory')

Testing Model Robustness

Mearsuring the robustness in baseline models trained with baseline methods

Run test.py

parser.add_argument('--steps', default=10, type=int, help='adv. steps')
parser.add_argument('--eps', default=0.03, type=float, help='max norm')
parser.add_argument('--dataset', default='cifar10', type=str, help='dataset name')
parser.add_argument('--network', default='vgg', type=str, help='network name')
parser.add_argument('--data_root', default='./datasets', type=str, help='path to dataset')
parser.add_argument('--gpu_id', default='0', type=str, help='gpu id')
parser.add_argument('--save_dir', default='./experiment', type=str, help='save directory')
parser.add_argument('--batch_size', default=100, type=int, help='Batch size')
parser.add_argument('--pop_number', default=3, type=int, help='Batch size')
parser.add_argument('--datetime', default='00000000', type=str, help='checkpoint datetime')
parser.add_argument('--pretrained', default='false', type=str2bool, help='pretrained boolean')
parser.add_argument('--batchnorm', default='true', type=str2bool, help='batchnorm boolean')
parser.add_argument('--baseline', default='AT', type=str, help='baseline')

Visualizing Robust and Non-Robust Features

Feature Interpreation

Run visualize.py

parser.add_argument('--lr', default=0.01, type=float, help='learning rate')
parser.add_argument('--steps', default=10, type=int, help='adv. steps')
parser.add_argument('--eps', default=0.03, type=float, help='max norm')
parser.add_argument('--dataset', default='cifar10', type=str, help='dataset name')
parser.add_argument('--network', default='vgg', type=str, help='network name')
parser.add_argument('--gpu_id', default='0', type=str, help='gpu id')
parser.add_argument('--data_root', default='./datasets', type=str, help='path to dataset')
parser.add_argument('--epoch', default=0, type=int, help='epoch number')
parser.add_argument('--attack', default='pgd', type=str, help='attack type')
parser.add_argument('--save_dir', default='./experiment', type=str, help='save directory')
parser.add_argument('--batch_size', default=1, type=int, help='Batch size')
parser.add_argument('--pop_number', default=3, type=int, help='Batch size')
parser.add_argument('--prior', default='AT', type=str, help='Plain or AT')
parser.add_argument('--prior_datetime', default='00000000', type=str, help='checkpoint datetime')
parser.add_argument('--pretrained', default='false', type=str2bool, help='pretrained boolean')
parser.add_argument('--batchnorm', default='true', type=str2bool, help='batchnorm boolean')
parser.add_argument('--vis_atk', default='True', type=str2bool, help='is attacked image?')

Adversarial-Information-Bottleneck - Distilling Robust and Non-Robust Features in Adversarial Examples by Information Bottleneck (NeurIPS21)

Related tags

Overview

NeurIPS 2021

Title: Distilling Robust and Non-Robust Features in Adversarial Examples by Information Bottleneck (paper)

Authors: Junho Kim, Byung-Kwan Lee, and Yong Man Ro (*: equally contributed)

Affiliation: School of Electric Engineering, Korea Advanced Institute of Science and Technology (KAIST)

Email: `[email protected]`, `[email protected]`, `[email protected]`

Citation

Datasets

Baseline Models

Adversarial Attacks (by torchattacks)

Included Packages (for Ours)

Baseline Methods

Testing Model Robustness

Visualizing Robust and Non-Robust Features

Owner

LBK

PyBullet CartPole and Quadrotor environments—with CasADi symbolic a priori dynamics—for learning-based control and reinforcement learning

GPT, but made only out of gMLPs

Memory Defense: More Robust Classificationvia a Memory-Masking Autoencoder

An offline deep reinforcement learning library

Using machine learning to predict and analyze high and low reader engagement for New York Times articles posted to Facebook.

Multi-Anchor Active Domain Adaptation for Semantic Segmentation (ICCV 2021 Oral)

Hyperbolic Image Segmentation, CVPR 2022

Optimized primitives for collective multi-GPU communication

Density-aware Single Image De-raining using a Multi-stream Dense Network (CVPR 2018)

Immortal tracker

Implementation of "Bidirectional Projection Network for Cross Dimension Scene Understanding" CVPR 2021 (Oral)

Generic image compressor for machine learning. Pytorch code for our paper "Lossy compression for lossless prediction".

Toward Spatially Unbiased Generative Models (ICCV 2021)

AI Virtual Calculator: This is a simple virtual calculator based on Artificial intelligence.

Pytorch implementation of "M-LSD: Towards Light-weight and Real-time Line Segment Detection"

Points2Surf: Learning Implicit Surfaces from Point Clouds (ECCV 2020 Spotlight)

Unofficial TensorFlow implementation of Protein Interface Prediction using Graph Convolutional Networks.

[CVPR2022] Representation Compensation Networks for Continual Semantic Segmentation

Code for a seq2seq architecture with Bahdanau attention designed to map stereotactic EEG data from human brains to spectrograms, using the PyTorch Lightning.

Imaging, analysis, and simulation software for radio interferometry

Adversarial-Information-Bottleneck - Distilling Robust and Non-Robust Features in Adversarial Examples by Information Bottleneck (NeurIPS21)

Related tags

Overview

NeurIPS 2021

Title: Distilling Robust and Non-Robust Features in Adversarial Examples by Information Bottleneck (paper)

Authors: Junho Kim*, Byung-Kwan Lee*, and Yong Man Ro (*: equally contributed)

Affiliation: School of Electric Engineering, Korea Advanced Institute of Science and Technology (KAIST)

Email: [email protected], [email protected], [email protected]

Citation

Datasets

Baseline Models

Adversarial Attacks (by torchattacks)

Included Packages (for Ours)

Baseline Methods

Testing Model Robustness

Visualizing Robust and Non-Robust Features

Owner

LBK

PyBullet CartPole and Quadrotor environments—with CasADi symbolic a priori dynamics—for learning-based control and reinforcement learning

GPT, but made only out of gMLPs

Memory Defense: More Robust Classificationvia a Memory-Masking Autoencoder

An offline deep reinforcement learning library

Using machine learning to predict and analyze high and low reader engagement for New York Times articles posted to Facebook.

Multi-Anchor Active Domain Adaptation for Semantic Segmentation (ICCV 2021 Oral)

Hyperbolic Image Segmentation, CVPR 2022

Optimized primitives for collective multi-GPU communication

Density-aware Single Image De-raining using a Multi-stream Dense Network (CVPR 2018)

Immortal tracker

Implementation of "Bidirectional Projection Network for Cross Dimension Scene Understanding" CVPR 2021 (Oral)

Generic image compressor for machine learning. Pytorch code for our paper "Lossy compression for lossless prediction".

Toward Spatially Unbiased Generative Models (ICCV 2021)

AI Virtual Calculator: This is a simple virtual calculator based on Artificial intelligence.

Pytorch implementation of "M-LSD: Towards Light-weight and Real-time Line Segment Detection"

Points2Surf: Learning Implicit Surfaces from Point Clouds (ECCV 2020 Spotlight)

Unofficial TensorFlow implementation of Protein Interface Prediction using Graph Convolutional Networks.

[CVPR2022] Representation Compensation Networks for Continual Semantic Segmentation

Code for a seq2seq architecture with Bahdanau attention designed to map stereotactic EEG data from human brains to spectrograms, using the PyTorch Lightning.

Imaging, analysis, and simulation software for radio interferometry

Authors: Junho Kim, Byung-Kwan Lee, and Yong Man Ro (*: equally contributed)

Email: `[email protected]`, `[email protected]`, `[email protected]`