Official Repository for "Activate or Not: Learning Customized Activation." [CVPR 2021]

Last update: Dec 27, 2022

CVPR 2021 | Activate or Not: Learning Customized Activation.

This repository contains the official PyTorch implementation of the paper Activate or Not: Learning Customized Activation, CVPR 2021.

ACON

We propose a novel activation function, which we term ACON, that explicitly learns whether or not to activate the neurons. Below we show the ACON activation function and its first derivative. β controls how fast the first derivative approaches its upper and lower bounds, which are determined by p1 and p2.
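To make the roles of p1, p2 and β concrete, here is a minimal scalar sketch in plain Python. It assumes the ACON-C form from the paper, f(x) = (p1 - p2)·x·σ(β(p1 - p2)·x) + p2·x; the helper names `sigmoid`, `acon_c` and `slope` are illustrative and are not taken from this repository, where the parameters would presumably be learnable tensors rather than fixed floats.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def acon_c(x, p1=1.0, p2=0.0, beta=1.0):
    """Scalar ACON-C (assumed form): (p1 - p2) * x * sigmoid(beta * (p1 - p2) * x) + p2 * x."""
    d = (p1 - p2) * x
    return d * sigmoid(beta * d) + p2 * x


def slope(x, h=1e-4, **params):
    # Central finite difference, used only to inspect the first derivative.
    return (acon_c(x + h, **params) - acon_c(x - h, **params)) / (2 * h)


# Print the slope at a few points for several values of beta.
for beta in (0.5, 1.0, 4.0):
    slopes = [round(slope(x, p1=1.2, p2=-0.3, beta=beta), 3) for x in (-4, -1, 0, 1, 4)]
    print(f"beta={beta}: {slopes}")
```

Under this form, the slope tends to p1 for large positive x and to p2 for large negative x, and a larger β makes the transition between the two sharper. With p1 = 1 and p2 = 0 the function reduces to Swish, x·σ(βx).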

Training curves

We show the training curves of different activations here.

TFNet

To show the effectiveness of the proposed ACON family, we also provide an extremely simple toy funnel network (TFNet) built only from pointwise convolutions and ACON-FReLU operators.

Main results

The following results are the relative improvements in ImageNet top-1 accuracy over the ReLU baselines. The relative improvements of Meta-ACON are about twice those of SENet.

Comparison between ReLU, Swish and ACON-C. ACON-C improves accuracy without adding FLOPs or parameters:

Model	FLOPs	#Params.	top-1 err. (ReLU)	top-1 err. (Swish)	top-1 err. (ACON)
ShuffleNetV2 0.5x	41M	1.4M	39.4	38.3 (+1.1)	37.0 (+2.4)
ShuffleNetV2 1.5x	299M	3.5M	27.4	26.8 (+0.6)	26.5 (+0.9)
ResNet 50	3.9G	25.5M	24.0	23.5 (+0.5)	23.2 (+0.8)
ResNet 101	7.6G	44.4M	22.8	22.7 (+0.1)	21.8 (+1.0)
ResNet 152	11.3G	60.0M	22.3	22.2 (+0.1)	21.2 (+1.1)
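The parenthesized gains in the table appear to be the ReLU top-1 error minus the error of the given activation, in percentage points. That reading is inferred from the numbers rather than stated in the table; a short Python check, with values copied from the rows above and a hypothetical helper `gain`, is consistent with it:

```python
# (ReLU, Swish, ACON) top-1 errors copied from the table above.
rows = {
    "ShuffleNetV2 0.5x": (39.4, 38.3, 37.0),
    "ShuffleNetV2 1.5x": (27.4, 26.8, 26.5),
    "ResNet 50": (24.0, 23.5, 23.2),
    "ResNet 101": (22.8, 22.7, 21.8),
    "ResNet 152": (22.3, 22.2, 21.2),
}


def gain(baseline_err, err):
    # Error reduction in percentage points, rounded to one decimal like the table.
    return round(baseline_err - err, 1)


for model, (relu, swish, acon) in rows.items():
    print(f"{model}: Swish +{gain(relu, swish)}, ACON +{gain(relu, acon)}")
```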

Next, by adding a negligible amount of FLOPs and parameters, meta-ACON shows significant improvements:

Model	FLOPs	#Params.	top-1 err.
ShuffleNetV2 0.5x (meta-acon)	41M	1.7M	34.8 (+4.6)
ShuffleNetV2 1.5x (meta-acon)	299M	3.9M	24.7 (+2.7)
ResNet 50 (meta-acon)	3.9G	25.7M	22.0 (+2.0)
ResNet 101 (meta-acon)	7.6G	44.8M	21.0 (+1.8)
ResNet 152 (meta-acon)	11.3G	60.5M	20.5 (+1.8)

The simple TFNet, without SE modules, outperforms state-of-the-art lightweight networks that also do not use SE modules.

Model	FLOPs	#Params.	top-1 err.
MobileNetV2 0.17	42M	1.4M	52.6
ShuffleNetV2 0.5x	41M	1.4M	39.4
TFNet 0.5	43M	1.3M	36.6 (+2.8)
MobileNetV2 0.6	141M	2.2M	33.3
ShuffleNetV2 1.0x	146M	2.3M	30.6
TFNet 1.0	135M	1.9M	29.7 (+0.9)
MobileNetV2 1.0	300M	3.4M	28.0
ShuffleNetV2 1.5x	299M	3.5M	27.4
TFNet 1.5	279M	2.7M	26.0 (+1.4)
MobileNetV2 1.4	585M	5.5M	25.3
ShuffleNetV2 2.0x	591M	7.4M	25.0
TFNet 2.0	474M	3.8M	24.3 (+0.7)

Trained Models

OneDrive download: Link
BaiduYun download: Link (extract code: 13fu)

Usage

Requirements

Download the ImageNet dataset and move validation images to labeled subfolders. To do this, you can use the following script: https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh
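As an illustration of what "labeled subfolders" means, the sketch below moves each validation image into a folder named after its class. It is not the linked script: the function name `sort_val_images` and the `labels` mapping (file name to class folder name) are hypothetical, and you would have to build that mapping yourself from the ImageNet validation annotations.

```python
import shutil
from pathlib import Path


def sort_val_images(val_dir, labels):
    """Move each listed image in val_dir into a subfolder named after its label.

    labels: dict mapping image file name -> class folder name (hypothetical input).
    Returns the number of files that were moved.
    """
    val_dir = Path(val_dir)
    moved = 0
    for name, class_name in labels.items():
        src = val_dir / name
        if not src.is_file():
            # Skip entries that are missing or were already moved.
            continue
        dst_dir = val_dir / class_name
        dst_dir.mkdir(exist_ok=True)
        shutil.move(str(src), str(dst_dir / name))
        moved += 1
    return moved
```

After such a step, the validation directory contains one folder per class, which is the layout that folder-based image loaders generally expect.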

Train:

python train.py --train-dir YOUR_TRAINDATASET_PATH --val-dir YOUR_VALDATASET_PATH

Eval:

python train.py --eval --eval-resume YOUR_WEIGHT_PATH --train-dir YOUR_TRAINDATASET_PATH --val-dir YOUR_VALDATASET_PATH

Citation

If you use these models in your research, please cite:

@inproceedings{ma2021activate,
  title={Activate or Not: Learning Customized Activation},
  author={Ma, Ningning and Zhang, Xiangyu and Liu, Ming and Sun, Jian},
  booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
  year={2021}
}
