Repository for Multimodal AutoML Benchmark

Last update: Nov 24, 2022

Overview

Benchmarking Multimodal AutoML for Tabular Data with Text Fields

This repository accompanies the NeurIPS 2021 Dataset Track submission "Benchmarking Multimodal AutoML for Tabular Data with Text Fields" (Link, Full Paper with Appendix). An earlier version of the paper, "Multimodal AutoML on Structured Tables with Text Fields" (Link), was accepted as an oral presentation at the ICML 2021 AutoML workshop. Because the benchmark has since been updated with more datasets, the version used in the workshop paper is archived in the icml_workshop branch.

This benchmark contains a diverse collection of tabular datasets. Each dataset contains numeric/categorical as well as text columns. The goal is to evaluate the performance of (automated) ML systems for supervised learning (classification and regression) with such multimodal data. The folder multimodal_text_benchmark/scripts/benchmark/ provides Python scripts to run different variants of the AutoGluon and H2O AutoML tools on the benchmark.

Datasets used in the Benchmark

Here's a brief summary of the datasets in our benchmark. Each dataset is described in greater detail in the multimodal_text_benchmark/ folder.

ID	key	#Train	#Test	Task	Metric	Prediction Target
prod	product_sentiment_machine_hack	5,091	1,273	multiclass	accuracy	sentiment related to product
salary	data_scientist_salary	15,84	3,961	multiclass	accuracy	salary range in data scientist job listings
airbnb	melbourne_airbnb	18,316	4,579	multiclass	accuracy	price of Airbnb listing
channel	news_channel	20,284	5,071	multiclass	accuracy	category of news article
wine	wine_reviews	84,123	21,031	multiclass	accuracy	variety of wine
imdb	imdb_genre_prediction	800	200	binary	roc_auc	whether film is a drama
fake	fake_job_postings2	12,725	3,182	binary	roc_auc	whether job postings are fake
kick	kick_starter_funding	86,052	21,626	binary	roc_auc	whether Kickstarter project will get funding
jigsaw	jigsaw_unintended_bias100K	100,000	25,000	binary	roc_auc	whether comments are toxic
qaa	google_qa_answer_type_reason_explanation	4,863	1,216	regression	r2	type of answer
qaq	google_qa_question_type_reason_explanation	4,863	1,216	regression	r2	type of question
book	bookprice_prediction	4,989	1,248	regression	r2	price of books
jc	jc_penney_products	10,860	2,715	regression	r2	price of JC Penney products
cloth	women_clothing_review	18,788	4,698	regression	r2	review score
ae	ae_price_prediction	22,662	5,666	regression	r2	American-Eagle item prices
pop	news_popularity2	24,007	6,002	regression	r2	news article popularity online
house	california_house_price	24,007	6,002	regression	r2	sale price of houses in California
mercari	mercari_price_suggestion100K	100,000	25,000	regression	r2	price of Mercari products
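
The Metric column pairs each task type with one score: accuracy for multiclass, ROC AUC for binary, and R² for regression. As a rough illustration of what these scores measure, here is a minimal pure-Python sketch. The function names and the `TASK_METRIC` mapping are illustrative assumptions and are not part of the benchmark code; in practice an established metrics library would normally be used instead.

```python
from typing import Callable, Dict, Sequence


def accuracy(y_true: Sequence, y_pred: Sequence) -> float:
    """Fraction of predictions that exactly match the labels."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive is scored above a random negative.

    Ties count as half a win. This pairwise version is O(n_pos * n_neg),
    which is fine for illustration but slow for large test sets.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all targets are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical lookup from the Task column to the matching Metric column.
TASK_METRIC: Dict[str, Callable] = {
    "multiclass": accuracy,
    "binary": roc_auc,
    "regression": r2,
}
```

For example, `TASK_METRIC["binary"]([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])` scores a toy binary problem with ROC AUC.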

License

The versions of datasets in this benchmark are released under the CC BY-NC-SA license. Note that the datasets in this benchmark are modified versions of previously publicly-available original copies and we do not own any of the datasets in the benchmark. Any data from this benchmark which has previously been published elsewhere falls under the original license from which the data originated. Please refer to the licenses of each original source linked in the multimodal_text_benchmark/README.md.

Install the Benchmark Suite

cd multimodal_text_benchmark
# Install the benchmarking suite
python3 -m pip install -U -e .

To quickly test the installation, run the dataset tests from the tests folder (the path below is relative to the repository root):

cd multimodal_text_benchmark/tests
python3 -m pytest test_datasets.py

To work with one of the datasets, use the following code:

from auto_mm_bench.datasets import dataset_registry

print(dataset_registry.list_keys())  # list of all dataset names
dataset_name = 'product_sentiment_machine_hack'

train_dataset = dataset_registry.create(dataset_name, 'train')
test_dataset = dataset_registry.create(dataset_name, 'test')
print(train_dataset.data)
print(test_dataset.data)

To access all datasets that comprise the benchmark:

from auto_mm_bench.datasets import create_dataset, TEXT_BENCHMARK_ALIAS_MAPPING

for dataset_name in list(TEXT_BENCHMARK_ALIAS_MAPPING.values()):
    print(dataset_name)
    dataset = create_dataset(dataset_name)
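
When only some of the datasets are needed, it can help to refer to them by the short ID from the table and to filter by task. The sketch below is one possible way to do that. It uses stand-in dictionaries copied from a few rows of the table; it assumes, without confirmation from the package, that `TEXT_BENCHMARK_ALIAS_MAPPING` has the same ID-to-key shape. The helper names `resolve_key` and `keys_for_task` are hypothetical.

```python
from typing import Dict, List

# Stand-in for an ID -> dataset key mapping (three rows copied from the table).
# Assumption: TEXT_BENCHMARK_ALIAS_MAPPING has this same shape.
ALIAS_TO_KEY: Dict[str, str] = {
    "prod": "product_sentiment_machine_hack",
    "imdb": "imdb_genre_prediction",
    "book": "bookprice_prediction",
}

# Task type per ID, copied from the Task column of the table.
ID_TO_TASK: Dict[str, str] = {
    "prod": "multiclass",
    "imdb": "binary",
    "book": "regression",
}


def resolve_key(name: str) -> str:
    """Accept either a short ID or a full dataset key and return the key."""
    if name in ALIAS_TO_KEY:
        return ALIAS_TO_KEY[name]
    if name in ALIAS_TO_KEY.values():
        return name
    raise KeyError(f"Unknown dataset: {name!r}")


def keys_for_task(task: str) -> List[str]:
    """Return the dataset keys whose task type matches `task`."""
    return [ALIAS_TO_KEY[i] for i, t in ID_TO_TASK.items() if t == task]


if __name__ == "__main__":
    print(resolve_key("prod"))
    print(keys_for_task("regression"))
```

A key returned by either helper could then be passed to `create_dataset` as in the loop above.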

Run Experiments

Go to multimodal_text_benchmark/scripts/benchmark to see how to run some baseline ML methods over the benchmark.

References

BibTeX entry of the ICML Workshop Version:

@article{agmultimodaltext,
  title={Multimodal AutoML on Structured Tables with Text Fields},
  author={Shi, Xingjian and Mueller, Jonas and Erickson, Nick and Li, Mu and Smola, Alexander},
  journal={8th ICML Workshop on Automated Machine Learning (AutoML)},
  year={2021}
}


Owner

Xingjian Shi
