This is the official PyTorch implementation of Student Helping Teacher: Teacher Evolution via Self-Knowledge Distillation (TESKD).

Last update: Sep 26, 2022

Overview

Student Helping Teacher: Teacher Evolution via Self-Knowledge Distillation (TESKD)

By Zheng Li^[1,4], Xiang Li^[2], Lingfeng Yang^[2,4], Jian Yang^[2], Zhigeng Pan^[3]*.

^[1]Hangzhou Normal University, ^[2]Nanjing University of Science and Technology, ^[3]Nanjing University of Information Science and Technology, ^[4]MEGVII Technology

Email: [email protected]

Abstract

Different from the existing teacher-teaching-student and student-teaching-student paradigms, in this paper we propose a novel student-helping-teacher formula, Teacher Evolution via Self-Knowledge Distillation (TESKD). The target backbone teacher network is constructed with multiple hierarchical student sub-networks in an FPN-like way, where each student shares various stages of the teacher backbone features. The diverse feedback from multiple students allows the teacher to improve itself through the shared intermediate representations. The well-trained teacher is used for final deployment. With TESKD, efficiency is significantly enhanced by a simplified one-stage distillation procedure, and model performance is improved.

Overall Architecture

An overview of our proposed TESKD. We divide the target backbone teacher into four blocks and construct three hierarchical student sub-networks (#1, #2 and #3) in an FPN-like way by sharing various stages of the teacher backbone features.
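The layout described above can be sketched in a few lines of PyTorch. This is only an illustrative reading of the description, not the repository's model: the class name `TinyTESKDSketch`, the channel widths, the single-convolution blocks, and the fusion rule (upsample the deeper feature and add a 1x1 lateral projection of the shallower one) are all assumptions made for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyTESKDSketch(nn.Module):
    """Hypothetical toy model: a four-block teacher plus three FPN-like students."""

    def __init__(self, num_classes=100, widths=(16, 32, 64, 128)):
        super().__init__()
        chans = (3,) + tuple(widths)
        # Teacher backbone split into four blocks (one conv each, for brevity).
        self.blocks = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(chans[i], chans[i + 1], 3,
                          stride=1 if i == 0 else 2, padding=1, bias=False),
                nn.BatchNorm2d(chans[i + 1]),
                nn.ReLU(inplace=True),
            )
            for i in range(4)
        ])
        self.teacher_fc = nn.Linear(widths[3], num_classes)
        # Lateral 1x1 convs project blocks 3, 2 and 1 to the top-level width.
        self.laterals = nn.ModuleList([
            nn.Conv2d(w, widths[3], kernel_size=1)
            for w in (widths[2], widths[1], widths[0])
        ])
        # One classifier head per student sub-network.
        self.student_fcs = nn.ModuleList([
            nn.Linear(widths[3], num_classes) for _ in range(3)
        ])

    def forward(self, x):
        feats = []
        for block in self.blocks:
            x = block(x)
            feats.append(x)
        teacher_logits = self.teacher_fc(
            F.adaptive_avg_pool2d(feats[3], 1).flatten(1))

        # Top-down path: each student fuses one more (shallower) teacher stage.
        student_logits = []
        top = feats[3]
        for lateral, fc, skip in zip(self.laterals, self.student_fcs,
                                     reversed(feats[:3])):
            top = F.interpolate(top, size=skip.shape[-2:], mode="nearest")
            top = top + lateral(skip)
            student_logits.append(fc(F.adaptive_avg_pool2d(top, 1).flatten(1)))
        return teacher_logits, student_logits
```

Because every student head reads the teacher's intermediate features, any loss placed on a student's logits also back-propagates into the shared backbone blocks, which is the mechanism the abstract refers to as students helping the teacher. At deployment only `self.blocks` and `self.teacher_fc` would be needed in this sketch.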

Implementation

Student Helping Teacher: Teacher Evolution via Self-Knowledge Distillation (TESKD): https://arxiv.org/abs/2110.00329

This is the official PyTorch implementation of TESKD.

Requirements

Python 3
PyTorch >= 1.7.0
torchvision >= 0.8.1
numpy >= 1.18.5
tqdm >= 4.47.0
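Before training, it can help to confirm that the installed packages meet these minimums. The snippet below is a small convenience sketch and not part of the repository; the helper names `parse` and `check` are made up for this example, the version parsing is deliberately simple (it keeps only leading numeric components), and it assumes Python 3.8 or newer for `importlib.metadata`.

```python
from importlib.metadata import PackageNotFoundError, version
from itertools import takewhile

# Minimum versions copied from the requirements list above.
MINIMUMS = {
    "torch": "1.7.0",
    "torchvision": "0.8.1",
    "numpy": "1.18.5",
    "tqdm": "4.47.0",
}


def parse(ver):
    """Turn '1.7.0+cu110' into (1, 7, 0); pad short versions with zeros."""
    parts = []
    for piece in ver.split("+")[0].split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts + [0] * (3 - len(parts)))


def check(minimums=MINIMUMS):
    """Return a {package: status} report for the given minimum versions."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = "missing"
            continue
        if parse(installed) >= parse(minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old ({installed} < {minimum})"
    return report
```

Calling `check()` returns a dictionary such as `{"torch": "ok", ...}` that can be printed or inspected before launching a run.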

Training

With this code, you can reproduce the paper's classification results on CIFAR-100 and ImageNet.

Running TESKD for ResNet18 on the CIFAR-100 dataset.

(We ran this experiment on a single machine with one NVIDIA GeForce RTX 2080Ti GPU.)

python classification/main.py \
      --data_dir 'your_data_path' \
      --final_dir 'your_model_storage_path' \
      --name 'res18_our_cifar' \
      --model_name 'resnet_our' \
      --network_name 'cifarresnet18' \
      --data 'CIFAR100' \
      --batch_size 128 \
      --ce_weight 0.2 \
      --kd_weight 0.8 \
      --fea_weight 1e-7
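The three weight flags in the command above suggest an objective made of a cross-entropy term, a distillation term, and a feature term. The sketch below shows one plausible way such weights could combine for a single student branch. It is an assumption for illustration, not the repository's loss: the function names, the temperature of 3.0, the KL-based distillation term, and the summed squared feature difference are all hypothetical. It is written in plain Python so the arithmetic can be followed without a GPU or PyTorch.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of floats."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Negative log-probability of the true class."""
    return -math.log(softmax(logits)[label])


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def combined_loss(student_logits, teacher_logits, label,
                  student_feat, teacher_feat,
                  ce_weight=0.2, kd_weight=0.8, fea_weight=1e-7,
                  temperature=3.0):
    """Weighted sum of CE, softened-logit KL, and squared feature distance.

    Default weights mirror the command-line flags; everything else is assumed.
    """
    ce = cross_entropy(student_logits, label)
    kd = kl_divergence(softmax(teacher_logits, temperature),
                       softmax(student_logits, temperature)) * temperature ** 2
    # Unnormalized sum of squares: such a term can be large for
    # high-dimensional features, which would fit a very small weight.
    fea = sum((s - t) ** 2 for s, t in zip(student_feat, teacher_feat))
    return ce_weight * ce + kd_weight * kd + fea_weight * fea
```

When the student and teacher produce identical logits and features, the distillation and feature terms vanish in this sketch and only the weighted cross-entropy remains.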

