Code and checkpoints for training the transformer-based Table QA models introduced in the paper TAPAS: Weakly Supervised Table Parsing via Pre-training.

Overview

TAble PArSing (TAPAS)

News

2021/08/24

  • Added a colab to try predictions on open domain question answering.

2021/08/20

2021/07/23

2021/05/13

2021/03/23

2020/12/17

2020/10/19

  • Small change to WTQ training example creation
    • Questions with ambiguous cell matches will now be discarded
    • This improves denotation accuracy by ~1 point
    • For more details see this issue.
  • Added option to filter table columns by textual overlap with question

2020/10/09

2020/08/26

  • Added a colab to try predictions on WTQ

2020/08/05

  • New pre-trained models (see Data section below)
  • reset_position_index_per_cell: New option for training models that reset the position index at the start of each cell instead of using absolute position indices.

2020/06/10

  • Bump TensorFlow to v2.2

2020/06/08

2020/05/07

  • Added a colab to try predictions on SQA

Installation

The easiest way to try out TAPAS with free GPU/TPU is in our Colab, which shows how to do predictions on SQA.

The repository uses protocol buffers, and requires the protoc compiler to run. You can download the latest binary for your OS here. On Ubuntu/Debian, it can be installed with:

sudo apt-get install protobuf-compiler

Afterwards, clone and install the git repository:

git clone https://github.com/google-research/tapas
cd tapas
pip install -e .

To run the test suite, we use the tox library, which can be invoked with:

pip install tox
tox

Models

We provide pre-trained models for different model sizes.

The metrics are computed by our own tool and are not the official metrics of the respective tasks. We provide them so you can check that your own runs are in the right ballpark. Each value is the median over three individual runs.

Models with intermediate pre-training (2020/10/07).

New models based on the ideas discussed in Understanding tables with intermediate pre-training. Learn more about the methods used here.

WTQ

Trained from Mask LM, intermediate data, SQA, WikiSQL.

Size Reset Dev Accuracy Link
LARGE noreset 0.5062 tapas_wtq_wikisql_sqa_inter_masklm_large.zip
LARGE reset 0.5097 tapas_wtq_wikisql_sqa_inter_masklm_large_reset.zip
BASE noreset 0.4525 tapas_wtq_wikisql_sqa_inter_masklm_base.zip
BASE reset 0.4638 tapas_wtq_wikisql_sqa_inter_masklm_base_reset.zip
MEDIUM noreset 0.4324 tapas_wtq_wikisql_sqa_inter_masklm_medium.zip
MEDIUM reset 0.4324 tapas_wtq_wikisql_sqa_inter_masklm_medium_reset.zip
SMALL noreset 0.3681 tapas_wtq_wikisql_sqa_inter_masklm_small.zip
SMALL reset 0.3762 tapas_wtq_wikisql_sqa_inter_masklm_small_reset.zip
MINI noreset 0.2783 tapas_wtq_wikisql_sqa_inter_masklm_mini.zip
MINI reset 0.2854 tapas_wtq_wikisql_sqa_inter_masklm_mini_reset.zip
TINY noreset 0.0823 tapas_wtq_wikisql_sqa_inter_masklm_tiny.zip
TINY reset 0.1039 tapas_wtq_wikisql_sqa_inter_masklm_tiny_reset.zip

WIKISQL

Trained from Mask LM, intermediate data, SQA.

Size Reset Dev Accuracy Link
LARGE noreset 0.8948 tapas_wikisql_sqa_inter_masklm_large.zip
LARGE reset 0.8979 tapas_wikisql_sqa_inter_masklm_large_reset.zip
BASE noreset 0.8859 tapas_wikisql_sqa_inter_masklm_base.zip
BASE reset 0.8855 tapas_wikisql_sqa_inter_masklm_base_reset.zip
MEDIUM noreset 0.8766 tapas_wikisql_sqa_inter_masklm_medium.zip
MEDIUM reset 0.8773 tapas_wikisql_sqa_inter_masklm_medium_reset.zip
SMALL noreset 0.8552 tapas_wikisql_sqa_inter_masklm_small.zip
SMALL reset 0.8615 tapas_wikisql_sqa_inter_masklm_small_reset.zip
MINI noreset 0.8063 tapas_wikisql_sqa_inter_masklm_mini.zip
MINI reset 0.8200 tapas_wikisql_sqa_inter_masklm_mini_reset.zip
TINY noreset 0.3198 tapas_wikisql_sqa_inter_masklm_tiny.zip
TINY reset 0.6046 tapas_wikisql_sqa_inter_masklm_tiny_reset.zip

TABFACT

Trained from Mask LM, intermediate data.

Size Reset Dev Accuracy Link
LARGE noreset 0.8101 tapas_tabfact_inter_masklm_large.zip
LARGE reset 0.8159 tapas_tabfact_inter_masklm_large_reset.zip
BASE noreset 0.7856 tapas_tabfact_inter_masklm_base.zip
BASE reset 0.7918 tapas_tabfact_inter_masklm_base_reset.zip
MEDIUM noreset 0.7585 tapas_tabfact_inter_masklm_medium.zip
MEDIUM reset 0.7587 tapas_tabfact_inter_masklm_medium_reset.zip
SMALL noreset 0.7321 tapas_tabfact_inter_masklm_small.zip
SMALL reset 0.7346 tapas_tabfact_inter_masklm_small_reset.zip
MINI noreset 0.6166 tapas_tabfact_inter_masklm_mini.zip
MINI reset 0.6845 tapas_tabfact_inter_masklm_mini_reset.zip
TINY noreset 0.5425 tapas_tabfact_inter_masklm_tiny.zip
TINY reset 0.5528 tapas_tabfact_inter_masklm_tiny_reset.zip

SQA

Trained from Mask LM, intermediate data.

Size Reset Dev Accuracy Link
LARGE noreset 0.7223 tapas_sqa_inter_masklm_large.zip
LARGE reset 0.7289 tapas_sqa_inter_masklm_large_reset.zip
BASE noreset 0.6737 tapas_sqa_inter_masklm_base.zip
BASE reset 0.6874 tapas_sqa_inter_masklm_base_reset.zip
MEDIUM noreset 0.6464 tapas_sqa_inter_masklm_medium.zip
MEDIUM reset 0.6561 tapas_sqa_inter_masklm_medium_reset.zip
SMALL noreset 0.5876 tapas_sqa_inter_masklm_small.zip
SMALL reset 0.6155 tapas_sqa_inter_masklm_small_reset.zip
MINI noreset 0.4574 tapas_sqa_inter_masklm_mini.zip
MINI reset 0.5148 tapas_sqa_inter_masklm_mini_reset.zip
TINY noreset 0.2004 tapas_sqa_inter_masklm_tiny.zip
TINY reset 0.2375 tapas_sqa_inter_masklm_tiny_reset.zip

INTERMEDIATE

Trained from Mask LM.

Size Reset Dev Accuracy Link
LARGE noreset 0.9309 tapas_inter_masklm_large.zip
LARGE reset 0.9317 tapas_inter_masklm_large_reset.zip
BASE noreset 0.9134 tapas_inter_masklm_base.zip
BASE reset 0.9163 tapas_inter_masklm_base_reset.zip
MEDIUM noreset 0.8988 tapas_inter_masklm_medium.zip
MEDIUM reset 0.9005 tapas_inter_masklm_medium_reset.zip
SMALL noreset 0.8788 tapas_inter_masklm_small.zip
SMALL reset 0.8798 tapas_inter_masklm_small_reset.zip
MINI noreset 0.8218 tapas_inter_masklm_mini.zip
MINI reset 0.8333 tapas_inter_masklm_mini_reset.zip
TINY noreset 0.6359 tapas_inter_masklm_tiny.zip
TINY reset 0.6615 tapas_inter_masklm_tiny_reset.zip

Small Models & position index reset (2020/08/08)

These models are based on the pre-trained checkpoints available at the BERT GitHub page. See that page or the paper for detailed information on the model dimensions.

Reset refers to whether the parameter reset_position_index_per_cell was set to true or false during training. In general it's recommended to set it to true.
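To make the difference concrete, here is a minimal sketch of the two position-indexing schemes. It assumes every token carries a hypothetical per-token cell id (0 for question tokens, a distinct id per table cell); the function names and the input layout are illustrative only and are not the repository's API.

```python
from typing import List


def absolute_position_ids(cell_ids: List[int]) -> List[int]:
    """Standard BERT-style positions: 0, 1, 2, ... over the whole sequence."""
    return list(range(len(cell_ids)))


def reset_position_ids(cell_ids: List[int]) -> List[int]:
    """Positions that restart at 0 whenever a token begins a new cell.

    `cell_ids` is a hypothetical per-token cell identifier; consecutive
    tokens with the same id belong to the same cell (or to the question).
    """
    positions = []
    previous_cell = None
    offset = 0
    for cell in cell_ids:
        # Continue counting inside a cell, restart when the cell changes.
        offset = offset + 1 if cell == previous_cell else 0
        positions.append(offset)
        previous_cell = cell
    return positions


# Two question tokens (cell 0), then tokens of cells 1, 2 and 3.
cell_ids = [0, 0, 1, 1, 1, 2, 3, 3]
print(absolute_position_ids(cell_ids))  # [0, 1, 2, 3, 4, 5, 6, 7]
print(reset_position_ids(cell_ids))     # [0, 1, 0, 1, 2, 0, 0, 1]
```

With the reset scheme the position values stay small regardless of how far into the table a cell is, which is one plausible reason it helps on long tables.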

The reported accuracy depends on the task: denotation accuracy for WTQ and WIKISQL; average position accuracy, with gold labels for the previous answers, for SQA; and Mask-LM accuracy for Mask-LM.
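As a rough illustration of denotation accuracy, the sketch below counts a question as correct when the predicted answer values match the gold values, ignoring order. The normalization is deliberately simple and is an assumption of this example; the evaluation tool in this repository applies its own matching rules, so numbers from this sketch will not necessarily agree with the tables.

```python
from typing import Dict, List


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace; a deliberately simple normalization."""
    return " ".join(value.lower().split())


def denotation_accuracy(
    predictions: Dict[str, List[str]],
    references: Dict[str, List[str]],
) -> float:
    """Fraction of questions whose predicted values equal the gold values.

    Order is ignored; a question missing from `predictions` counts as wrong.
    """
    if not references:
        return 0.0
    correct = 0
    for question_id, gold in references.items():
        predicted = predictions.get(question_id, [])
        if sorted(map(normalize, predicted)) == sorted(map(normalize, gold)):
            correct += 1
    return correct / len(references)


# Hypothetical question ids and answers, for illustration only.
references = {"q1": ["Paris"], "q2": ["3"], "q3": ["Red", "Blue"]}
predictions = {"q1": ["paris"], "q2": ["4"], "q3": ["blue", "red"]}
print(denotation_accuracy(predictions, references))  # two of three are correct
```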

The models were trained in a chain as indicated by the model name. For example, sqa_masklm means the model was first trained on the Mask-LM task and then on SQA. No distillation was performed.
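The naming convention can be read mechanically. The helper below is a hypothetical sketch (it is not part of the repository) that strips the tapas prefix, the size and the optional reset suffix from an archive name and returns the remaining stages in the order they were trained.

```python
from typing import List

# Size names as they appear in the archive names listed in this README.
SIZES = {"large", "base", "medium", "small", "mini", "tiny"}


def training_chain(archive_name: str) -> List[str]:
    """Return the training stages in the order they were applied."""
    stem = archive_name
    if stem.endswith(".zip"):
        stem = stem[: -len(".zip")]
    # Keep only the stage names: drop the prefix, the size and "reset".
    stages = [
        part
        for part in stem.split("_")
        if part != "tapas" and part != "reset" and part not in SIZES
    ]
    # Names list the most recent stage first, so reverse them.
    return list(reversed(stages))


print(training_chain("tapas_wtq_wikisql_sqa_inter_masklm_large_reset.zip"))
# ['masklm', 'inter', 'sqa', 'wikisql', 'wtq']
print(training_chain("sqa_masklm"))
# ['masklm', 'sqa']
```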

WTQ

Size Reset Dev Accuracy Link
LARGE noreset 0.4822 tapas_wtq_wikisql_sqa_masklm_large.zip
LARGE reset 0.4952 tapas_wtq_wikisql_sqa_masklm_large_reset.zip
BASE noreset 0.4288 tapas_wtq_wikisql_sqa_masklm_base.zip
BASE reset 0.4433 tapas_wtq_wikisql_sqa_masklm_base_reset.zip
MEDIUM noreset 0.4158 tapas_wtq_wikisql_sqa_masklm_medium.zip
MEDIUM reset 0.4097 tapas_wtq_wikisql_sqa_masklm_medium_reset.zip
SMALL noreset 0.3267 tapas_wtq_wikisql_sqa_masklm_small.zip
SMALL reset 0.3670 tapas_wtq_wikisql_sqa_masklm_small_reset.zip
MINI noreset 0.2275 tapas_wtq_wikisql_sqa_masklm_mini.zip
MINI reset 0.2409 tapas_wtq_wikisql_sqa_masklm_mini_reset.zip
TINY noreset 0.0901 tapas_wtq_wikisql_sqa_masklm_tiny.zip
TINY reset 0.0947 tapas_wtq_wikisql_sqa_masklm_tiny_reset.zip

WIKISQL

Size Reset Dev Accuracy Link
LARGE noreset 0.8862 tapas_wikisql_sqa_masklm_large.zip
LARGE reset 0.8917 tapas_wikisql_sqa_masklm_large_reset.zip
BASE noreset 0.8772 tapas_wikisql_sqa_masklm_base.zip
BASE reset 0.8809 tapas_wikisql_sqa_masklm_base_reset.zip
MEDIUM noreset 0.8687 tapas_wikisql_sqa_masklm_medium.zip
MEDIUM reset 0.8736 tapas_wikisql_sqa_masklm_medium_reset.zip
SMALL noreset 0.8285 tapas_wikisql_sqa_masklm_small.zip
SMALL reset 0.8550 tapas_wikisql_sqa_masklm_small_reset.zip
MINI noreset 0.7672 tapas_wikisql_sqa_masklm_mini.zip
MINI reset 0.7944 tapas_wikisql_sqa_masklm_mini_reset.zip
TINY noreset 0.3237 tapas_wikisql_sqa_masklm_tiny.zip
TINY reset 0.3608 tapas_wikisql_sqa_masklm_tiny_reset.zip

SQA

Size Reset Dev Accuracy Link
LARGE noreset 0.7002 tapas_sqa_masklm_large.zip
LARGE reset 0.7130 tapas_sqa_masklm_large_reset.zip
BASE noreset 0.6393 tapas_sqa_masklm_base.zip
BASE reset 0.6689 tapas_sqa_masklm_base_reset.zip
MEDIUM noreset 0.6026 tapas_sqa_masklm_medium.zip
MEDIUM reset 0.6141 tapas_sqa_masklm_medium_reset.zip
SMALL noreset 0.4976 tapas_sqa_masklm_small.zip
SMALL reset 0.5589 tapas_sqa_masklm_small_reset.zip
MINI noreset 0.3779 tapas_sqa_masklm_mini.zip
MINI reset 0.3687 tapas_sqa_masklm_mini_reset.zip
TINY noreset 0.2013 tapas_sqa_masklm_tiny.zip
TINY reset 0.2194 tapas_sqa_masklm_tiny_reset.zip

MASKLM

Size Reset Dev Accuracy Link
LARGE noreset 0.7513 tapas_masklm_large.zip
LARGE reset 0.7528 tapas_masklm_large_reset.zip
BASE noreset 0.7323 tapas_masklm_base.zip
BASE reset 0.7335 tapas_masklm_base_reset.zip
MEDIUM noreset 0.7059 tapas_masklm_medium.zip
MEDIUM reset 0.7054 tapas_masklm_medium_reset.zip
SMALL noreset 0.6818 tapas_masklm_small.zip
SMALL reset 0.6856 tapas_masklm_small_reset.zip
MINI noreset 0.6382 tapas_masklm_mini.zip
MINI reset 0.6425 tapas_masklm_mini_reset.zip
TINY noreset 0.4826 tapas_masklm_tiny.zip
TINY reset 0.5282 tapas_masklm_tiny_reset.zip

Original Models

The pre-trained TAPAS checkpoints can be downloaded here:

The first two models are pre-trained on the Mask-LM task and the last two on the Mask-LM task first and SQA second.

Fine-Tuning Data

You also need to download the task data for the fine-tuning tasks:

Pre-Training

Note that you can skip pre-training and just use one of the pre-trained checkpoints provided above.

Information about the pre-training data can be found here.

The TF examples for pre-training can be created using Google Dataflow:

python3 setup.py sdist
python3 tapas/create_pretrain_examples_main.py \
  --input_file="gs://tapas_models/2020_05_11/interactions.txtpb.gz" \
  --vocab_file="gs://tapas_models/2020_05_11/vocab.txt" \
  --output_dir="gs://your_bucket/output" \
  --runner_type="DATAFLOW" \
  --gc_project="your-project" \
  --gc_region="us-west1" \
  --gc_job_name="create-pretrain" \
  --gc_staging_location="gs://your_bucket/staging" \
  --gc_temp_location="gs://your_bucket/tmp" \
  --extra_packages=dist/tapas-0.0.1.dev0.tar.gz

You can also run the pipeline locally but that will take a long time:

python3 tapas/create_pretrain_examples_main.py \
  --input_file="$data/interactions.txtpb.gz" \
  --output_dir="$data/" \
  --vocab_file="$data/vocab.txt" \
  --runner_type="DIRECT"

This will create two tfrecord files for training and testing. The pre-training can then be started with the command below. The init checkpoint should be a standard BERT checkpoint.

python3 tapas/experiments/tapas_pretraining_experiment.py \
  --eval_batch_size=32 \
  --train_batch_size=512 \
  --tpu_iterations_per_loop=5000 \
  --num_eval_steps=100 \
  --save_checkpoints_steps=5000 \
  --num_train_examples=512000000 \
  --max_seq_length=128 \
  --input_file_train="${data}/train.tfrecord" \
  --input_file_eval="${data}/test.tfrecord" \
  --init_checkpoint="${tapas_data_dir}/model.ckpt" \
  --bert_config_file="${tapas_data_dir}/bert_config.json" \
  --model_dir="..." \
  --compression_type="" \
  --do_train

Set compression_type to GZIP if the tfrecords are compressed. You can start a separate eval job by setting --nodo_train --do_eval.
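For planning purposes it can help to translate the flag values in the pre-training command into step counts. The sketch below assumes that the number of training steps is the number of training examples divided by the batch size; that relationship is an assumption of this example, so check the experiment code before relying on it. The function name is hypothetical.

```python
def training_schedule(num_train_examples: int, train_batch_size: int,
                      save_checkpoints_steps: int) -> dict:
    """Derive step and checkpoint counts, assuming steps = examples // batch size."""
    num_train_steps = num_train_examples // train_batch_size
    return {
        "num_train_steps": num_train_steps,
        "num_checkpoints": num_train_steps // save_checkpoints_steps,
    }


# Values taken from the pre-training command in this README.
schedule = training_schedule(
    num_train_examples=512_000_000,
    train_batch_size=512,
    save_checkpoints_steps=5_000,
)
print(schedule)  # {'num_train_steps': 1000000, 'num_checkpoints': 200}
```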

Running a fine-tuning task

We need to create the TF examples before starting the training. For example, for SQA that would look like:

python3 tapas/run_task_main.py \
  --task="SQA" \
  --input_dir="${sqa_data_dir}" \
  --output_dir="${output_dir}" \
  --bert_vocab_file="${tapas_data_dir}/vocab.txt" \
  --mode="create_data"

Optionally, to handle big tables, we can add a --prune_columns flag, which applies the HEM method described in Section 3.3 of our paper to discard some columns based on their textual overlap with the sentence.
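The idea of pruning columns by textual overlap can be sketched as follows. This is a simplified illustration, not the HEM implementation used by --prune_columns: it scores each column by how many question tokens appear in its header or cells and keeps the best-scoring columns. The function names, the tokenizer and the max_columns parameter are all assumptions of this example.

```python
import re
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercased alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def prune_columns(header: List[str], rows: List[List[str]],
                  question: str, max_columns: int) -> List[int]:
    """Return indices of the `max_columns` columns that overlap most with the question."""
    question_tokens = tokenize(question)
    scores = []
    for index, name in enumerate(header):
        column_tokens = tokenize(name)
        for row in rows:
            column_tokens |= tokenize(row[index])
        scores.append(len(column_tokens & question_tokens))
    # Sort by descending overlap; the sort is stable, so ties keep the leftmost column.
    ranked = sorted(range(len(header)), key=lambda i: -scores[i])
    # Restore the original column order for the kept columns.
    return sorted(ranked[:max_columns])


# Toy table, for illustration only.
header = ["Player", "Country", "Titles", "Notes"]
rows = [["Roger Federer", "Switzerland", "20", "retired"],
        ["Rafael Nadal", "Spain", "22", "active"]]
kept = prune_columns(header, rows, "How many titles does Nadal have?", max_columns=2)
print([header[i] for i in kept])  # ['Player', 'Titles']
```

Keeping the surviving columns in their original order means the pruned table is still a valid sub-table of the input.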

Afterwards, training can be started by running:

python3 tapas/run_task_main.py \
  --task="SQA" \
  --output_dir="${output_dir}" \
  --init_checkpoint="${tapas_data_dir}/model.ckpt" \
  --bert_config_file="${tapas_data_dir}/bert_config.json" \
  --mode="train" \
  --use_tpu

This will use the preset hyper-parameters set in hparam_utils.py.

It's recommended to start a separate eval job to continuously produce predictions for the checkpoints created by the training job. Alternatively, you can run the eval job after training to only get the final results.

python3 tapas/run_task_main.py \
  --task="SQA" \
  --output_dir="${output_dir}" \
  --init_checkpoint="${tapas_data_dir}/model.ckpt" \
  --bert_config_file="${tapas_data_dir}/bert_config.json" \
  --mode="predict_and_evaluate"

Another tool to run experiments is tapas_classifier_experiment.py. It's more flexible than run_task_main.py but also requires setting all the hyper-parameters (via the respective command line flags).

Evaluation

Here we explain some details about different tasks.

SQA

By default, SQA will evaluate using the reference answers of the previous questions. The numbers in the paper (Table 5) are computed using the more realistic setup where the previous answers are model predictions. run_task_main.py will output additional prediction files for this setup as well if run on GPU.

WTQ

For the official evaluation results one should convert the TAPAS predictions to the WTQ format and run the official evaluation script. This can be done using convert_predictions.py.

WikiSQL

As discussed in the paper, our code computes evaluation metrics that deviate from those of the official evaluation script (Tables 3 and 10).

Hardware Requirements

TAPAS is essentially a BERT model and thus has the same requirements. This means that training the large model with a sequence length of 512 will require a TPU. You can use the option max_seq_length to create shorter sequences. This will reduce accuracy but also make the model trainable on GPUs. Another option is to reduce the batch size (train_batch_size), but this will likely also affect accuracy. We added an option, gradient_accumulation_steps, that allows you to split the gradient computation over multiple batches. Evaluation with the default test batch size (32) should be possible on GPU.
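The reason gradient accumulation can stand in for a larger batch is that, for a loss averaged over examples, the mean of the gradients of equally sized micro-batches equals the gradient of the full batch. The sketch below demonstrates this on a toy one-parameter model in plain Python. It illustrates the general technique only and makes no claim about how gradient_accumulation_steps is implemented in this repository; all names are hypothetical.

```python
from typing import List, Tuple

Example = Tuple[float, float]  # (input x, target y)


def mse_gradient(weight: float, batch: List[Example]) -> float:
    """Gradient of mean((weight * x - y)^2) with respect to weight."""
    return sum(2.0 * (weight * x - y) * x for x, y in batch) / len(batch)


def accumulated_gradient(weight: float, batch: List[Example],
                         accumulation_steps: int) -> float:
    """Average the gradients of `accumulation_steps` equally sized micro-batches."""
    assert len(batch) % accumulation_steps == 0, "batch must split evenly"
    size = len(batch) // accumulation_steps
    total = 0.0
    for step in range(accumulation_steps):
        # Only one micro-batch needs to be held in memory at a time.
        micro_batch = batch[step * size:(step + 1) * size]
        total += mse_gradient(weight, micro_batch)
    return total / accumulation_steps


batch = [(1.0, 2.0), (2.0, 3.9), (3.0, 6.1), (4.0, 8.2)]
full = mse_gradient(0.5, batch)
accumulated = accumulated_gradient(0.5, batch, accumulation_steps=2)
print(abs(full - accumulated) < 1e-9)  # True
```

The trade-off is time: each optimizer update now needs several forward and backward passes, so memory use drops while the wall-clock time per update grows.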

How to cite TAPAS?

You can cite the ACL 2020 paper, and the EMNLP 2020 Findings paper for the later work on pre-training objectives.

Disclaimer

This is not an official Google product.

Contact information

For help or issues, please submit a GitHub issue.
