Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System

Related tags

Deep Learningpptod
Overview

Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System

Authors: Yixuan Su, Lei Shu, Elman Mansimov, Arshit Gupta, Deng Cai, Yi-An Lai, and Yi Zhang

Code our PPTOD paper: Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System

Introduction:

Pre-trained language models have been recently shown to benefit task-oriented dialogue (TOD) systems. Despite their success, existing methods often formulate this task as a cascaded generation problem which can lead to error accumulation across different sub-tasks and greater data annotation overhead. In this study, we present PPTOD, a unified model that seamlessly supports both task-oriented dialogue understanding and response generation in a plug-and-play fashion. In addition, we introduce a new dialogue multi-task pre-training strategy that allows the model to learn the primary TOD task completion skills from heterogeneous dialog corpora. We extensively test our model on three benchmark TOD tasks, including end-to-end dialogue modelling, dialogue state tracking, and intent classification. Results show that PPTOD creates new state-of-the-art on all evaluated tasks in both full training and low-resource scenarios. Furthermore, comparisons against previous SOTA methods show that the responses generated by PPTOD are more factually correct and semantically coherent as judged by human annotators.

Alt text

1. Citation

If you find our paper and resources useful, please kindly cite our paper:

  @article{su2021multitask,
    author    = {Yixuan Su and
                 Lei Shu and
                 Elman Mansimov and
                 Arshit Gupta and
                 Deng Cai and
                 Yi{-}An Lai and
                 Yi Zhang},
    title     = {Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System},
    journal   = {CoRR},
    volume    = {abs/2109.14739},
    year      = {2021},
    url       = {https://arxiv.org/abs/2109.14739},
    eprinttype = {arXiv},
    eprint    = {2109.14739}
  }

2. Environment Setup:

pip3 install -r requirements.txt
python -m spacy download en_core_web_sm

3. PPTOD Checkpoints:

You can download checkpoints of PPTOD with different configurations here.

PPTOD-small PPTOD-base PPTOD-large
here here here

To use PPTOD, you should download the checkpoint you want and unzip it in the ./checkpoints directory.

Alternatively, you can run the following commands to download the PPTOD checkpoints.

(1) Downloading Pre-trained PPTOD-small Checkpoint:

cd checkpoints
chmod +x ./download_pptod_small.sh
./download_pptod_small.sh

(2) Downloading Pre-trained PPTOD-base Checkpoint:

cd checkpoints
chmod +x ./download_pptod_base.sh
./download_pptod_base.sh

(3) Downloading Pre-trained PPTOD-large Checkpoint:

cd checkpoints
chmod +x ./download_pptod_large.sh
./download_pptod_large.sh

4. Data Preparation:

The detailed instruction for preparing the pre-training corpora and the data of downstream TOD tasks are provided in the ./data folder.

5. Dialogue Multi-Task Pre-training:

To pre-train a PPTOD model from scratch, please refer to details provided in ./Pretraining directory.

6. Benchmark TOD Tasks:

(1) End-to-End Dialogue Modelling:

To perform End-to-End Dialogue Modelling using PPTOD, please refer to details provided in ./E2E_TOD directory.

(2) Dialogue State Tracking:

To perform Dialogue State Tracking using PPTOD, please refer to details provided in ./DST directory.

(3) Intent Classification:

To perform Intent Classification using PPTOD, please refer to details provided in ./IC directory.

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.

Owner
Amazon Web Services - Labs
AWS Labs
Amazon Web Services - Labs
Benchmarks for Model-Based Optimization

Design-Bench Design-Bench is a benchmarking framework for solving automatic design problems that involve choosing an input that maximizes a black-box

Brandon Trabucco 43 Dec 20, 2022
A Python library created to assist programmers with complex mathematical functions

libmaths libmaths was created not only as a learning experience for me, but as a way to make mathematical models in seconds for Python users using mat

Simple 73 Oct 02, 2022
Efficient 3D Backbone Network for Temporal Modeling

VoV3D is an efficient and effective 3D backbone network for temporal modeling implemented on top of PySlowFast. Diverse Temporal Aggregation and

102 Dec 06, 2022
Additional functionality for use with fastai’s medical imaging module

fmi Adding additional functionality to fastai's medical imaging module To learn more about medical imaging using Fastai you can view my blog Install g

14 Oct 31, 2022
ECLARE: Extreme Classification with Label Graph Correlations

ECLARE ECLARE: Extreme Classification with Label Graph Correlations @InProceedings{Mittal21b, author = "Mittal, A. and Sachdeva, N. and Agrawal

Extreme Classification 35 Nov 06, 2022
An excellent hash algorithm combining classical sponge structure and RNN.

SHA-RNN Recurrent Neural Network with Chaotic System for Hash Functions Anonymous Authors [摘要] 在这次作业中我们提出了一种新的 Hash Function —— SHA-RNN。其以海绵结构为基础,融合了混

Houde Qian 5 May 15, 2022
We propose a new method for effective shadow removal by regarding it as an exposure fusion problem.

Auto-exposure fusion for single-image shadow removal We propose a new method for effective shadow removal by regarding it as an exposure fusion proble

Qing Guo 146 Dec 31, 2022
Painting app using Python machine learning and vision technology.

AI Painting App We are making an app that will track our hand and helps us to draw from that. We will be using the advance knowledge of Machine Learni

Badsha Laskar 3 Oct 03, 2022
This is the code repository implementing the paper "TreePartNet: Neural Decomposition of Point Clouds for 3D Tree Reconstruction".

TreePartNet This is the code repository implementing the paper "TreePartNet: Neural Decomposition of Point Clouds for 3D Tree Reconstruction". Depende

刘彦超 34 Nov 30, 2022
The official repository for BaMBNet

BaMBNet-Pytorch Paper

Junjun Jiang 18 Dec 04, 2022
A PyTorch implementation of "From Two to One: A New Scene Text Recognizer with Visual Language Modeling Network" (ICCV2021)

From Two to One: A New Scene Text Recognizer with Visual Language Modeling Network The official code of VisionLAN (ICCV2021). VisionLAN successfully a

81 Dec 12, 2022
Buffon’s needle: one of the oldest problems in geometric probability

Buffon-s-Needle Buffon’s needle is one of the oldest problems in geometric proba

3 Feb 18, 2022
LVI-SAM: Tightly-coupled Lidar-Visual-Inertial Odometry via Smoothing and Mapping

LVI-SAM This repository contains code for a lidar-visual-inertial odometry and mapping system, which combines the advantages of LIO-SAM and Vins-Mono

Tixiao Shan 1.1k Dec 27, 2022
Subgraph Based Learning of Contextual Embedding

SLiCE Self-Supervised Learning of Contextual Embeddings for Link Prediction in Heterogeneous Networks Dataset details: We use four public benchmark da

Pacific Northwest National Laboratory 27 Dec 01, 2022
Time-series-deep-learning - Developing Deep learning LSTM, BiLSTM models, and NeuralProphet for multi-step time-series forecasting of stock price.

Stock Price Prediction Using Deep Learning Univariate Time Series Predicting stock price using historical data of a company using Neural networks for

Abdultawwab Safarji 7 Nov 27, 2022
shufflev2-yolov5:lighter, faster and easier to deploy

shufflev2-yolov5: lighter, faster and easier to deploy. Evolved from yolov5 and the size of model is only 1.7M (int8) and 3.3M (fp16). It can reach 10+ FPS on the Raspberry Pi 4B when the input size

pogg 1.5k Jan 05, 2023
Implementation of Vaswani, Ashish, et al. "Attention is all you need."

Attention Is All You Need Paper Implementation This is my from-scratch implementation of the original transformer architecture from the following pape

Brando Koch 195 Dec 30, 2022
Flask101 - FullStack Web Development with Python & JS - From TAQWA

Task: Create a CLI Calculator Step 0: Creating Virtual Environment $ python -m

Hossain Foysal 1 May 31, 2022
Semi-supervised Domain Adaptation via Minimax Entropy

Semi-supervised Domain Adaptation via Minimax Entropy (ICCV 2019) Install pip install -r requirements.txt The code is written for Pytorch 0.4.0, but s

Vision and Learning Group 243 Jan 09, 2023
Boost learning for GNNs from the graph structure under challenging heterophily settings. (NeurIPS'20)

Beyond Homophily in Graph Neural Networks: Current Limitations and Effective Designs Jiong Zhu, Yujun Yan, Lingxiao Zhao, Mark Heimann, Leman Akoglu,

GEMS Lab: Graph Exploration & Mining at Scale, University of Michigan 70 Dec 18, 2022