MARS: Learning Modality-Agnostic Representation for Scalable Cross-media Retrieva

Last update: Aug 24, 2022

Related tags

Deep Learning MARS_TCSVT2021

Overview

Introduction

This is the source code of our TCSVT 2021 paper "MARS: Learning Modality-Agnostic Representation for Scalable Cross-media Retrieval". Please cite the following paper if you use our code.

Yunbo Wang and Yuxin Peng, "MARS: Learning Modality-Agnostic Representation for Scalable Cross-media Retrieval", IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2021.

Preparation

We use Python 3.7.2, PyTorch 1.1.0, cuda 9.0, and evaluate on Ubuntu 16.04.12

Install anaconda downloaded from https://repo.anaconda.com/archive. And create a new environment sh Anaconda3-2018.12-Linux-x86_64.sh conda create -n MARS python=3.7.2 conda activate MARS
Run the followed commands conda install pytorch==1.1.0 torchvision==0.3.0 cudatoolkit=9.0 -c pytorch pip install -r requirements.txt

Training and evaluation

We use the Wikipedia dataset as example, and the data is placed in ./datasets/Wiki. In addition, the XMedia&XMediaNet datasets are obtiand via http://59.108.48.34/tiki/XMediaNet/. The NUS-WIDE dataset is obtained via https://lms.comp.nus.edu.sg/wp-content/uploads/2019/research/nuswide/NUS-WIDE.html.

Run the followed command for traning&evaluation, and the configure can be found in main_MARS.py. python main_MARS.py --datasets wiki --output_shape 128 --batch_size 64 --epochs 50 --lr [1e-4, 5e-4] # for Wikipedia

The common representations can be found in folder "features".

For any questions, fell free to contact us. ([email protected])

Welcome to our Laboratory Homepage for more information.

MARS: Learning Modality-Agnostic Representation for Scalable Cross-media Retrieva

Related tags

Overview

Introduction

Preparation

Training and evaluation

Owner

HybridNets: End-to-End Perception Network

This is an early in-development version of training CLIP models with hivemind.

Hyperopt for solving CIFAR-100 with a convolutional neural network (CNN) built with Keras and TensorFlow, GPU backend

Acute ischemic stroke dataset

Fast Style Transfer in TensorFlow

LaBERT - A length-controllable and non-autoregressive image captioning model.

Code to reproduce the experiments in the paper "Transformer Based Multi-Source Domain Adaptation" (EMNLP 2020)

Enhancing Knowledge Tracing via Adversarial Training

Automatically align face images 🙃→🙂. Can also do windowing and warping.

Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context Code in both PyTorch and TensorFlow

banditml is a lightweight contextual bandit & reinforcement learning library designed to be used in production Python services.

CodeContests is a competitive programming dataset for machine-learning

Config files for my GitHub profile.

WarpDrive: Extremely Fast End-to-End Deep Multi-Agent Reinforcement Learning on a GPU

The implementation of "Shuffle Transformer: Rethinking Spatial Shuffle for Vision Transformer"

ZeroVL - The official implementation of ZeroVL

[ACL-IJCNLP 2021] "EarlyBERT: Efficient BERT Training via Early-bird Lottery Tickets"

Simple-Image-Classification - Simple Image Classification Code (PyTorch)

Instance-Dependent Partial Label Learning

Technical Indicators implemented in Python only using Numpy-Pandas as Magic - Very Very Fast! Very tiny! Stock Market Financial Technical Analysis Python library . Quant Trading automation or cryptocoin exchange