Deep Residual Networks with 1K Layers

By Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun.

Microsoft Research Asia (MSRA).

Introduction
Notes
Usage

Introduction

This repository contains re-implemented code for the paper "Identity Mappings in Deep Residual Networks" (http://arxiv.org/abs/1603.05027). This work makes it possible to train high-quality 1k-layer neural networks in a very simple way.

Acknowledgement: This code was re-implemented by Xiang Ming from Xi'an Jiaotong University for ease of release.

See Also: Re-implementations of ResNet-200 [a] on ImageNet from Facebook AI Research (FAIR): https://github.com/facebook/fb.resnet.torch/tree/master/pretrained

Notes

This code is based on the implementation of Torch ResNets (https://github.com/facebook/fb.resnet.torch).
The experiments in the paper were conducted in Caffe, whereas this code is re-implemented in Torch. We observed similar results within reasonable statistical variations.
To fit the 1k-layer models into memory without modifying much code, we simply reduced the mini-batch size to 64; the results in the paper were obtained with a mini-batch size of 128. Somewhat unexpectedly, the results with a mini-batch size of 64 are slightly better:

mini-batch              CIFAR-10 test error (%): median (mean+/-std)
128 (as in [a])         4.92 (4.89+/-0.14)
64 (as in this code)    4.62 (4.69+/-0.20)

Figure: curves obtained by running this code with a mini-batch size of 64 (training loss: y-axis on the left; test error: y-axis on the right).


Usage

Install Torch ResNets (https://github.com/facebook/fb.resnet.torch) following instructions therein.
Add the file resnet-pre-act.lua from this repository to ./models.
To train ResNet-1001 in the form described in [a]:

th main.lua -netType resnet-pre-act -depth 1001 -batchSize 64 -nGPU 2 -nThreads 4 -dataset cifar10 -nEpochs 200 -shareGradInput false

Note: "-shareGradInput true" is not yet supported for this model.
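
The usage steps can be sketched as a small shell helper. This is a minimal sketch and not part of the original release: the function name build_train_cmd, its positional arguments, and the dry-run behaviour (printing the command instead of running it) are assumptions made for illustration. It uses only the flags that appear in the training command in this section, and it rejects any shareGradInput value other than false, in line with the note on that option.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): assemble the training
# command from the Usage section so it can be inspected before running.
#
# Steps 1-2 of the Usage list are shown as comments because they need
# network access and a working Torch install:
#   git clone https://github.com/facebook/fb.resnet.torch
#   cp resnet-pre-act.lua fb.resnet.torch/models/

build_train_cmd() {
    depth="${1:-1001}"               # value passed to -depth
    batch_size="${2:-64}"            # mini-batch size used by this code
    share_grad_input="${3:-false}"   # must stay false for this model

    # shareGradInput=true is not valid for this model yet, so refuse it.
    if [ "$share_grad_input" != "false" ]; then
        echo "error: -shareGradInput must be false for resnet-pre-act" >&2
        return 1
    fi

    # echo joins its arguments with single spaces, giving one command line.
    echo "th main.lua -netType resnet-pre-act -depth $depth" \
         "-batchSize $batch_size -nGPU 2 -nThreads 4" \
         "-dataset cifar10 -nEpochs 200 -shareGradInput false"
}

# Print the default command (ResNet-1001, mini-batch size 64).
build_train_cmd
```

To actually launch training, the printed line could be run from inside the fb.resnet.torch checkout, for example with eval "$(build_train_cmd)". Passing 128 as the second argument gives the mini-batch size used in the paper, provided the model fits in GPU memory.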
