Knowledge-Inheritance

Source code for the paper: Knowledge Inheritance for Pre-trained Language Models (preprint). The trained model parameters (in Fairseq format) can be downloaded from Tsinghua Cloud. You can use convert_fairseq_to_huggingface.py to convert the Fairseq checkpoints into Huggingface's transformers format.

For downstream performance evaluation, we follow the implementations of Fairseq (GLUE tasks) and Don't Stop Pre-training (ACL-ARC / CHEMPROT).

If you have any questions, feel free to contact us ([email protected]).

1. Available Pretrained Models

WB domain: Wikipedia + BookCorpus; CS domain: computer science papers; BIO domain: biomedical papers.

Models trained by self-learning

RoBERTa_WB_H_4
RoBERTa_WB_H_6
RoBERTa_WB_H_8
RoBERTa_WB_H_10
RoBERTa_WB_D_288
RoBERTa_WB_D_384
RoBERTa_WB_D_480
RoBERTa_WB_D_576
RoBERTa_WB_D_672
RoBERTa_WB_BASE
RoBERTa_WB_MEDIUM
RoBERTa_WB_BASE_PLUS
RoBERTa_WB_LARGE
GPT_WB_MEDIUM
GPT_WB_BASE
GPT_WB_BASE_PLUS
RoBERTa_CS_MEDIUM
RoBERTa_CS_BASE
RoBERTa_BIO_MEDIUM
RoBERTa_BIO_BASE
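
The checkpoint names above appear to follow an architecture_domain_variant pattern. The sketch below splits a name into those three parts, which can help when scripting downloads or conversions. The helper name parse_model_name and the ModelName tuple are hypothetical and not part of this repository, and the pattern itself is inferred from the list rather than documented.

```python
from typing import NamedTuple


class ModelName(NamedTuple):
    """Parts of a checkpoint name (hypothetical helper, not from the repo)."""
    architecture: str  # e.g. "RoBERTa" or "GPT"
    domain: str        # e.g. "WB", "CS" or "BIO"
    variant: str       # e.g. "BASE_PLUS", "H_4", "D_288"


# Domains listed in this README.
KNOWN_DOMAINS = {"WB", "CS", "BIO"}


def parse_model_name(name: str) -> ModelName:
    """Split 'RoBERTa_WB_BASE_PLUS' into ('RoBERTa', 'WB', 'BASE_PLUS').

    Assumes the first two underscore-separated fields are the architecture
    and the domain, and everything after them is the variant.
    """
    parts = name.strip().split("_", 2)
    if len(parts) != 3:
        raise ValueError(f"unexpected model name: {name!r}")
    architecture, domain, variant = parts
    if domain not in KNOWN_DOMAINS:
        raise ValueError(f"unknown domain {domain!r} in {name!r}")
    return ModelName(architecture, domain, variant)


if __name__ == "__main__":
    for name in ["RoBERTa_WB_H_4", "GPT_WB_BASE_PLUS", "RoBERTa_BIO_MEDIUM"]:
        print(parse_model_name(name))
```

The variant field is kept as an opaque string because this README does not spell out what the H_ and D_ prefixes mean.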

Models trained by Knowledge Inheritance

RoBERTa_WB_BASE -> RoBERTa_WB_BASE_PLUS
RoBERTa_WB_BASE -> RoBERTa_WB_LARGE
RoBERTa_WB_BASE_PLUS -> RoBERTa_WB_LARGE
RoBERTa_WB_BASE -> RoBERTa_WB_BASE_PLUS -> RoBERTa_WB_LARGE
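
Each entry above can be read as a chain of training stages. The sketch below breaks a chain into consecutive (source, target) pairs. It assumes that "A -> B" means model B was trained with knowledge inherited from model A, which is our reading of the notation rather than something this README states. The function name inheritance_stages is hypothetical.

```python
from typing import List, Tuple


def inheritance_stages(chain: str) -> List[Tuple[str, str]]:
    """Turn 'A -> B -> C' into [('A', 'B'), ('B', 'C')].

    Assumption: the arrow points from the model whose knowledge is
    inherited to the model that inherits it.
    """
    models = [m.strip() for m in chain.split("->")]
    if len(models) < 2 or not all(models):
        raise ValueError(f"not a valid inheritance chain: {chain!r}")
    # Pair each model with the one that follows it in the chain.
    return list(zip(models, models[1:]))


if __name__ == "__main__":
    chain = "RoBERTa_WB_BASE -> RoBERTa_WB_BASE_PLUS -> RoBERTa_WB_LARGE"
    for source, target in inheritance_stages(chain):
        print(f"{target} inherits from {source}")
```

A three-model chain therefore yields two stages, with the middle model acting as the target of the first stage and the source of the second.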

Owner

THUNLP
