Dynamic Slimmable Network (CVPR 2021, Oral)

Overview

Dynamic Slimmable Network (DS-Net)

This repository contains the PyTorch code for our paper, Dynamic Slimmable Network (CVPR 2021 Oral).


Architecture of DS-Net. The width of each supernet stage is adjusted adaptively by the slimming ratio ρ predicted by the gate.


Accuracy vs. complexity on ImageNet.

Usage

1. Requirements

2. Stage I: Supernet Training

For example, to train the dynamic slimmable MobileNet supernet on 8 GPUs (takes about 2 days):

python -m torch.distributed.launch --nproc_per_node=8 train.py /PATH/TO/ImageNet -c ./configs/mobilenetv1_bn_uniform.yml

3. Stage II: Gate Training

  • Will be available soon

Citation

If you use our code for your paper, please cite:

@inproceedings{li2021dynamic,
  author = {Changlin Li and
            Guangrun Wang and
            Bing Wang and
            Xiaodan Liang and
            Zhihui Li and
            Xiaojun Chang},
  title = {Dynamic Slimmable Network},
  booktitle = {CVPR},
  year = {2021}
}
Comments
  • The usage of gumbel softmax in DS-Net


    Thank you for your very nice work. I would like to understand the effect of Gumbel softmax, because I think the network could be trained without it. Is Gumbel softmax only meant to increase the randomness of the channel choice?

    discussion 
    opened by LinyeLi60 7
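
For readers following this question, here is a minimal pure-Python sketch of what Gumbel-softmax sampling does to a gate's logits. It is not the DS-Net implementation: the helper name `gumbel_softmax_sample`, the temperature `tau`, and the four example logits are assumptions made for illustration.

```python
import math
import random

def gumbel_softmax_sample(logits, tau=1.0, rng=random):
    """Return a relaxed one-hot sample over the choices (hypothetical helper)."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise: g = -log(-log(u)), with u ~ Uniform(0, 1).
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep log() finite
        noisy.append((logit - math.log(-math.log(u))) / tau)
    # Numerically stable softmax over the perturbed logits.
    m = max(noisy)
    exps = [math.exp(v - m) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]

rng = random.Random(0)
gate_logits = [2.0, 1.0, 0.5, 0.1]  # hypothetical scores for 4 width choices
probs = gumbel_softmax_sample(gate_logits, tau=1.0, rng=rng)
choice = max(range(len(probs)), key=probs.__getitem__)
print(probs, choice)
```

Calling the helper repeatedly makes the selected index vary from draw to draw, which is the randomness the question refers to. In an autograd framework the same relaxation also keeps the sample differentiable with respect to the logits.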
  • UserWarning: Argument interpolation should be of type InterpolationMode instead of int. Please, use InterpolationMode enum.


    Why do I get the following warning when I run python3 -m torch.distributed.launch --nproc_per_node=1 train.py ./imagenet -c ./configs/mobilenetv1_bn_uniform.yml?

    /home/chauncey/.local/lib/python3.8/site-packages/torchvision/transforms/functional.py:364: UserWarning: Argument interpolation should be of type InterpolationMode instead of int. Please, use InterpolationMode enum.
      warnings.warn(

    opened by Chauncey-Wang 3
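
The warning says that torchvision received a PIL integer code where it now expects an `InterpolationMode` member. A minimal sketch of the conversion follows. The helper `to_interpolation_mode` is hypothetical and not part of this repository, and the torchvision import is guarded so the snippet also runs where torchvision is not installed.

```python
# PIL resampling integer codes and the matching InterpolationMode member names.
PIL_CODE_TO_MODE_NAME = {
    0: "NEAREST",
    1: "LANCZOS",
    2: "BILINEAR",
    3: "BICUBIC",
    4: "BOX",
    5: "HAMMING",
}

def to_interpolation_mode(code):
    """Hypothetical helper: map a PIL int code to an InterpolationMode member.

    Returns the member name as a string when torchvision cannot be imported.
    """
    name = PIL_CODE_TO_MODE_NAME[code]
    try:
        from torchvision.transforms.functional import InterpolationMode
    except Exception:  # torchvision missing or failing to load
        return name
    return getattr(InterpolationMode, name)

mode = to_interpolation_mode(3)  # 3 is PIL's BICUBIC
print(mode)
```

Passing the enum member (for example `InterpolationMode.BICUBIC`) wherever an integer was passed to a resize or crop transform should silence the message. The message is a `UserWarning`, not an error.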
  • Question about calculating MAdds of dynamic network in the paper


    Thank you for your great work. I have a question about how MAdds are calculated in your paper. The dynamic network uses a different width, and therefore different MAdds, for each instance, but you report a single MAdds value for each network. Is that the average MAdds over the whole dataset?

    discussion 
    opened by sseung0703 3
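
If the reported figure is a dataset average, as the question suggests (the thread here does not confirm it), it could be computed along these lines. The per-subnetwork MAdds come from the v0.0.1 release table; the per-image choices are invented for illustration, and the sketch ignores the cost of the gate itself.

```python
# MAdds (in millions) of subnetworks 0..13, from the v0.0.1 release table.
SUBNET_MADDS_M = [133, 153, 175, 200, 226, 255, 286,
                  319, 355, 393, 433, 475, 519, 565]

def average_madds(choices, madds=SUBNET_MADDS_M):
    """Mean MAdds over a set of per-image subnetwork choices."""
    if not choices:
        raise ValueError("choices must not be empty")
    return sum(madds[c] for c in choices) / len(choices)

# Hypothetical gate decisions for six images (not real results).
choices = [0, 0, 3, 7, 13, 0]
avg = average_madds(choices)
print(f"average MAdds: {avg:.1f}M")
```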
  • why not set ensemble_ib to True?


    Hi,

    I found that ensemble_ib is set to False in the configs for both slim training and gate training, but according to the paper, setting it to True should boost performance.

    Any idea?

    opened by twmht 2
  • MAdds of Pretrained Supernet


    Hi Changlin, your work is excellent. I have a question about the calculation of MAdds. README.md lists the MAdds of Subnetwork 13 as 565M, but I think it should be 821M, which is what I observed in my experiments: Subnetwork 13 has more channels than the original MobileNetV1, and 565M should be the MAdds of the original MobileNetV1 1.0. Looking forward to your reply.

    opened by LinyeLi60 2
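
As a reference point for this discussion, the sketch below counts multiply-adds for the standard MobileNetV1 layout at 224x224 input. It is an independent count, not the repository's FLOPs counter: the function name and the `width` multiplier are assumptions, only convolution and classifier layers are counted, and it does not reproduce DS-Net's per-stage channel lists.

```python
def mobilenet_v1_madds(width=1.0, resolution=224, num_classes=1000):
    """Count multiply-adds of conv and FC layers (BN and activations ignored)."""
    def ch(c):
        return int(c * width)

    # (output channels, stride) of the 13 depthwise-separable blocks.
    blocks = [(64, 1), (128, 2), (128, 1), (256, 2), (256, 1), (512, 2)]
    blocks += [(512, 1)] * 5 + [(1024, 2), (1024, 1)]

    size = resolution // 2                      # the stem conv has stride 2
    c_in = ch(32)
    total = size * size * 3 * 3 * 3 * c_in      # 3x3 stem conv on RGB input
    for c_out, stride in blocks:
        c_out = ch(c_out)
        size //= stride
        total += size * size * 3 * 3 * c_in     # 3x3 depthwise conv
        total += size * size * c_in * c_out     # 1x1 pointwise conv
        c_in = c_out
    total += c_in * num_classes                 # final classifier
    return total

baseline = mobilenet_v1_madds()
print(f"{baseline / 1e6:.1f}M")
```

With the default arguments the count comes out at roughly 569M, close to the 565M figure discussed above. Small differences between counters are common, since tools disagree on which layers to include.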
  • Error after changing num_choice in mobilenetv1_bn_uniform_reset_bn.yml

    I followed your suggestion and set num_choice in mobilenetv1_bn_uniform_reset_bn.yml to 14, but I get an error when I run python -m torch.distributed.launch --nproc_per_node=8 train.py /PATH/TO/ImageNet -c ./configs/mobilenetv1_bn_uniform_reset_bn.yml.

    08/25 10:15:57 AM Recalibrating BatchNorm statistics...
    08/25 10:16:10 AM Finish recalibrating BatchNorm statistics.
    08/25 10:16:19 AM Finish recalibrating BatchNorm statistics.
    08/25 10:16:21 AM Test: [ 0/0] Mode: 0 Time: 0.344 (0.344) Loss: 6.9204 (6.9204) [email protected]: 0.0000 ( 0.0000) [email protected]: 0.0000 ( 0.0000) Flops: 132890408 (132890408)
    08/25 10:16:22 AM Test: [ 0/0] Mode: 1 Time: 0.406 (0.406) Loss: 6.9189 (6.9189) [email protected]: 0.0000 ( 0.0000) [email protected]: 0.0000 ( 0.0000) Flops: 152917440 (152917440)
    08/25 10:16:22 AM Test: [ 0/0] Mode: 2 Time: 0.381 (0.381) Loss: 6.9187 (6.9187) [email protected]: 0.0000 ( 0.0000) [email protected]: 0.0000 ( 0.0000) Flops: 175152224 (175152224)
    08/25 10:16:23 AM Test: [ 0/0] Mode: 3 Time: 0.389 (0.389) Loss: 6.9134 (6.9134) [email protected]: 0.0000 ( 0.0000) [email protected]: 0.0000 ( 0.0000) Flops: 199594752 (199594752)
    Traceback (most recent call last):
      File "train.py", line 658, in <module>
        main()
      File "train.py", line 635, in main
        eval_metrics.append(validate_slim(model,
      File "/home/chauncey/PycharmProjects/DS-Net-main/dyn_slim/apis/train_slim.py", line 215, in validate_slim
        output = model(input)
      File "/home/chauncey/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/chauncey/PycharmProjects/DS-Net-main/dyn_slim/models/dyn_slim_net.py", line 191, in forward
        x = self.forward_features(x)
      File "/home/chauncey/PycharmProjects/DS-Net-main/dyn_slim/models/dyn_slim_net.py", line 178, in forward_features
        x = stage(x)
      File "/home/chauncey/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/chauncey/PycharmProjects/DS-Net-main/dyn_slim/models/dyn_slim_stages.py", line 48, in forward
        x = self.first_block(x)
      File "/home/chauncey/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/chauncey/PycharmProjects/DS-Net-main/dyn_slim/models/dyn_slim_blocks.py", line 240, in forward
        x = self.conv_pw(x)
      File "/home/chauncey/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
        result = self.forward(*input, **kwargs)
      File "/home/chauncey/PycharmProjects/DS-Net-main/dyn_slim/models/dyn_slim_ops.py", line 94, in forward
        self.running_outc = self.out_channels_list[self.channel_choice]
    IndexError: list index out of range

    It looks like some adjustments are also needed in other .py files.

    opened by chaunceywx 2
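
The traceback ends at an index into `out_channels_list`, and the log shows modes 0 to 3 passing before the failure. One possible reading, stated here as an assumption and not a confirmed diagnosis, is that some layer is built with fewer width options than the `num_choice` set in the yml. The toy class below is hypothetical and only illustrates that failure mode, along with an early check that gives a clearer message.

```python
class SlimmableLayerSketch:
    """Toy stand-in for a layer that picks its output width by index."""

    def __init__(self, out_channels_list):
        self.out_channels_list = out_channels_list
        self.channel_choice = 0

    def set_choice(self, choice):
        # Guard: fail early with a readable message instead of a bare IndexError.
        if not 0 <= choice < len(self.out_channels_list):
            raise ValueError(
                f"channel_choice {choice} is out of range for "
                f"{len(self.out_channels_list)} configured widths"
            )
        self.channel_choice = choice

    def running_out_channels(self):
        # Same indexing pattern as the failing line in the traceback.
        return self.out_channels_list[self.channel_choice]

layer = SlimmableLayerSketch([32, 40, 48, 56])  # hypothetical: only 4 widths
num_choice = 14                                  # value set in the yml
first_bad_mode = None
for mode in range(num_choice):
    try:
        layer.set_choice(mode)
        layer.running_out_channels()
    except ValueError as err:
        first_bad_mode = mode
        print(err)
        break
```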
  • Why does num_choice differ between the yml files?

    Why is num_choice set to 4 in mobilenetv1_bn_uniform_reset_bn.yml but to 14 in the other two yml files?

    Bro, if you are Chinese too, let's just talk in Chinese; my English is pretty poor...

    opened by chaunceywx 2
  • Problem running the code

    Could someone explain what causes the problem below?

    Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.

    /root/anaconda3/envs/0108/lib/python3.6/site-packages/torchvision/io/image.py:11: UserWarning: Failed to load image Python extension: /root/anaconda3/envs/0108/lib/python3.6/site-packages/torchvision/image.so: undefined symbol: _ZNK3c106IValue23reportToTensorTypeErrorEv
      warn(f"Failed to load image Python extension: {e}")
    /root/anaconda3/envs/0108/lib/python3.6/site-packages/torchvision/io/image.py:11: UserWarning: Failed to load image Python extension: /root/anaconda3/envs/0108/lib/python3.6/site-packages/torchvision/image.so: undefined symbol: _ZNK3c106IValue23reportToTensorTypeErrorEv
      warn(f"Failed to load image Python extension: {e}")
    01/21 05:42:18 AM Added key: store_based_barrier_key:1 to store for rank: 1
    01/21 05:42:18 AM Added key: store_based_barrier_key:1 to store for rank: 0
    01/21 05:42:18 AM Training in distributed mode with multiple processes, 1 GPU per process. Process 0, total 2.
    01/21 05:42:18 AM Training in distributed mode with multiple processes, 1 GPU per process. Process 1, total 2.
    01/21 05:42:20 AM Model slimmable_mbnet_v1_bn_uniform created, param count: 7676204
    01/21 05:42:20 AM Data processing configuration for current model + dataset:
    01/21 05:42:20 AM input_size: (3, 224, 224)
    01/21 05:42:20 AM interpolation: bicubic
    01/21 05:42:20 AM mean: (0.485, 0.456, 0.406)
    01/21 05:42:20 AM std: (0.229, 0.224, 0.225)
    01/21 05:42:20 AM crop_pct: 0.875
    01/21 05:42:20 AM NVIDIA APEX not installed. AMP off.
    01/21 05:42:21 AM Using torch DistributedDataParallel. Install NVIDIA Apex for Apex DDP.
    01/21 05:42:21 AM Scheduled epochs: 40
    01/21 05:42:21 AM Training folder does not exist at: images/train
    01/21 05:42:21 AM Training folder does not exist at: images/train
    Killing subprocess 239
    Killing subprocess 240
    Traceback (most recent call last):
      File "/root/anaconda3/envs/0108/lib/python3.6/runpy.py", line 193, in _run_module_as_main
        "__main__", mod_spec)
      File "/root/anaconda3/envs/0108/lib/python3.6/runpy.py", line 85, in _run_code
        exec(code, run_globals)
      File "/root/anaconda3/envs/0108/lib/python3.6/site-packages/torch/distributed/launch.py", line 340, in <module>
        main()
      File "/root/anaconda3/envs/0108/lib/python3.6/site-packages/torch/distributed/launch.py", line 326, in main
        sigkill_handler(signal.SIGTERM, None) # not coming back
      File "/root/anaconda3/envs/0108/lib/python3.6/site-packages/torch/distributed/launch.py", line 301, in sigkill_handler
        raise subprocess.CalledProcessError(returncode=last_return_code, cmd=cmd)
    subprocess.CalledProcessError: Command '['/root/anaconda3/envs/0108/bin/python', '-u', 'train.py', '--local_rank=1', 'images', '-c', './configs/mobilenetv1_bn_uniform_reset_bn.yml']' returned non-zero exit status 1.

    opened by 6imust 1
  • project environment


    Hi, could you provide the environment for the project? I tried to train the network with python=3.8, pytorch=1.7.1, cuda=10.2. Shortly after training starts, a RuntimeError: CUDA error: device-side assert triggered occurs, and some other environments lead to the same error. I'm not sure whether the problem is caused by the difference in environment.

    opened by singularity97 1
  • Softmax twice for SGS loss?


    Dear authors, thanks for this nice work.

    I wonder why the calculation of the SGS loss is using the softmaxed data rather than the logits, considering the PyTorch CrossEntropyLoss already contains a softmax inside.

    https://github.com/changlin31/DS-Net/blob/15cd3036970ec27d2c306014344fd50d9e9b888b/dyn_slim/apis/train_slim_gate.py#L98 https://github.com/changlin31/DS-Net/blob/15cd3036970ec27d2c306014344fd50d9e9b888b/dyn_slim/models/dyn_slim_blocks.py#L324-L355

    opened by Yu-Zhewen 0
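
To make the concern concrete, the sketch below compares a cross-entropy computed from raw logits with one computed from already-softmaxed values, using only the standard library. It takes no position on whether the linked code does this intentionally; the logits are made up, and `cross_entropy` is a small stand-in that, like PyTorch's CrossEntropyLoss, applies a softmax internally.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(logits, target):
    """Cross-entropy with a built-in softmax over the inputs."""
    return -math.log(softmax(logits)[target])

logits = [4.0, 1.0, 0.0, -1.0]  # hypothetical gate logits
target = 0

loss_from_logits = cross_entropy(logits, target)
loss_from_probs = cross_entropy(softmax(logits), target)  # softmax applied twice

# Inputs confined to [0, 1] cap the target probability at e / (e + K - 1),
# so the double-softmax loss cannot fall below log(1 + (K - 1) / e).
floor = math.log(1 + (len(logits) - 1) / math.e)
print(loss_from_logits, loss_from_probs, floor)
```

With a second softmax the loss stays above a positive floor even for a perfectly confident prediction, so its values and gradients differ from those of the logits-based loss.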
  • Can we further improve AutoSlim without the gate?

    It is not easy to deploy the gate operator on some other backends, such as TensorRT.

    So my question is: can we further improve AutoSlim without the dynamic gate at inference time? Is there any ongoing work on this?

    opened by twmht 3
  • DS-Net for object detection


    Hello. Thanks for your work. I noticed that you also conducted some experiments on object detection. I wonder whether, and when, you will release that code.

    opened by NoLookDefense 8
  • Dynamic path for DS-mobilenet


    Hi. Thanks for your work. I am reading your paper and trying to reimplement it, and I am confused about some details. The paper says the slimming ratio is ρ∈[0.35 : 0.05 : 1.25], which has 18 paths. However, the code has only 14 paths, ρ∈[0.35 : 0.05 : 1], as defined in https://github.com/changlin31/DS-Net/blob/15cd3036970ec27d2c306014344fd50d9e9b888b/dyn_slim/models/dyn_slim_net.py#L36 . Also, during gate training the gate function has only a 4-dimensional output, meaning that there are only 4 paths and the slimming ratio is restricted to ρ∈[0.35 : 0.05 : 0.5]: https://github.com/changlin31/DS-Net/blob/15cd3036970ec27d2c306014344fd50d9e9b888b/dyn_slim/models/dyn_slim_blocks.py#L204 Why are the dynamic paths for larger networks not used?

    opened by NoLookDefense 1
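
The path counts in the last question can be checked with a few lines of arithmetic. The sketch assumes the `[start : step : stop]` notation includes both endpoints, which the thread does not state; the helper name is illustrative.

```python
def slimming_ratios(start_pct, stop_pct, step_pct=5):
    """Inclusive list of ratios, using integer percentages to avoid float drift."""
    return [p / 100 for p in range(start_pct, stop_pct + 1, step_pct)]

counts = {}
for lo, hi in [(35, 125), (35, 100), (35, 50)]:
    ratios = slimming_ratios(lo, hi)
    counts[(lo, hi)] = len(ratios)
    print(f"[{lo / 100:.2f} : 0.05 : {hi / 100:.2f}] -> {len(ratios)} ratios")
```

Under that assumption the second and third ranges give 14 and 4 ratios, matching the counts quoted for the code. The first range gives 19, not 18, so the figure quoted for the paper may follow a different counting convention.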
Releases(v0.0.1)
  • v0.0.1(Nov 30, 2021)

    Pretrained weights of the DS-MBNet supernet. Detailed accuracy of each sub-network:

    | Subnetwork | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 |
    | ----------------- | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: | :---: |
    | MAdds | 133M | 153M | 175M | 200M | 226M | 255M | 286M | 319M | 355M | 393M | 433M | 475M | 519M | 565M |
    | Top-1 (%) | 70.1 | 70.4 | 70.8 | 71.2 | 71.6 | 72.0 | 72.4 | 72.7 | 73.0 | 73.3 | 73.6 | 73.9 | 74.1 | 74.6 |
    | Top-5 (%) | 89.4 | 89.6 | 89.9 | 90.2 | 90.3 | 90.6 | 90.9 | 91.0 | 91.2 | 91.4 | 91.5 | 91.7 | 91.8 | 92.0 |

    Source code(tar.gz)
    Source code(zip)
    DS_MBNet-70_1.pth.tar(60.93 MB)
    log-DS_MBNet-70_1.txt(6.12 KB)
Owner
Changlin Li