FIGARO: Generating Symbolic Music with Fine-Grained Artistic Control

Related tags

Deep Learningfigaro
Overview

FIGARO: Generating Symbolic Music with Fine-Grained Artistic Control

by Dimitri von RĂŒtte, Luca Biggio, Yannic Kilcher, Thomas Hofmann

Getting started

Prerequisites:

  • Python 3.9
  • Conda

Setup

  1. Clone this repository to your disk
  2. Install required packages (see requirements.txt). With Conda:
conda create --name figaro python=3.9
conda activate figaro
pip install -r requirements.txt

Preparing the Data

To train models and to generate new samples, we use the Lakh MIDI dataset (altough any collection of MIDI files can be used).

  1. Download (size: 1.6GB) and extract the archive file:
wget http://hog.ee.columbia.edu/craffel/lmd/lmd_full.tar.gz
tar -xzf lmd_full.tar.gz
  1. You may wish to remove the archive file now: rm lmd_full.tar.gz

Download Pre-Trained Models

If you don't wish to train your own models, you can download our pre-trained models.

  1. Download (size: 2.3GB) and extract the archive file:
wget -O checkpoints.zip https://polybox.ethz.ch/index.php/s/a0HUHzKuPPefWkW/download
unzip checkpoints.zip
  1. You may wish to remove the archive file now: rm checkpoints.zip

Training

Training arguments such as model type, batch size, model params are passed to the training scripts via environment variables.

Available model types are:

  • vq-vae: VQ-VAE model used for the learned desription
  • figaro: FIGARO with both the expert and learned description
  • figaro-expert: FIGARO with only the expert description
  • figaro-learned: FIGARO with only the learned description
  • figaro-no-inst: FIGARO (expert) without instruments
  • figaro-no-chord: FIGARO (expert) without chords
  • figaro-no-meta: FIGARO (expert) without style (meta) information
  • baseline: Unconditional decoder-only baseline following Huang et al. (2018)

Example invocation of the training script is given by the following command:

MODEL=figaro-expert python src/train.py

For models using the learned description (figaro and figaro-learned), a pre-trained VQ-VAE checkpoint needs to be provided as well:

MODEL=figaro VAE_CHECKPOINT=./checkpoints/vq-vae.ckpt python src/train.py

Generation

To generate samples, make sure you have a trained checkpoint prepared (either download one or train it yourself). For this script, make sure that the dataset is prepared according to Preparing the Data. This is needed to extract descriptions, based on which new samples can be generated.

An example invocation of the generation script is given by the following command:

MODEL=figaro-expert CHECKPOINT=./checkpoints/figaro-expert.ckpt python src/generate.py

For models using the learned description (figaro and figaro-learned), a pre-trained VQ-VAE checkpoint needs to be provided as well:

MODEL=figaro CHECKPOINT=./checkpoints/figaro.ckpt VAE_CHECKPOINT=./checkpoints/vq-vae.ckpt python src/generate.py

Evaluation

We provide the evaluation scripts used to calculate the desription metrics on some set of generated samples. Refer to the previous section for how to generate samples yourself.

Example usage:

SAMPLE_DIR=./samples/figaro-expert python src/evaluate.py

Parameters

The following environment variables are available for controlling hyperparameters beyond their default value.

Training (train.py)

Model

Variable Description Default value
MODEL Model architecture to be trained
D_MODEL Hidden size of the model 512
CONTEXT_SIZE Number of tokens in the context to be passed to the auto-encoder 256
D_LATENT [VQ-VAE] Dimensionality of the latent space 1024
N_CODES [VQ-VAE] Codebook size 2048
N_GROUPS [VQ-VAE] Number of groups to split the latent vector into before discretization 16

Optimization

Variable Description Default value
EPOCHS Max. number of training epochs 16
MAX_TRAINING_STEPS Max. number of training iterations 100,000
BATCH_SIZE Number of samples in each batch 128
TARGET_BATCH_SIZE Number of samples in each backward step, gradients will be accumulated over TARGET_BATCH_SIZE//BATCH_SIZE batches 256
WARMUP_STEPS Number of learning rate warmup steps 4000
LEARNING_RATE Initial learning rate, will be decayed after constant warmup of WARMUP_STEPS steps 1e-4

Others

Variable Description Default value
CHECKPOINT Path to checkpoint from which to resume training
VAE_CHECKPOINT Path to VQ-VAE checkpoint to be used for the learned description
ROOT_DIR The folder containing MIDI files to train on ./lmd_full
OUTPUT_DIR Folder for saving checkpoints ./results
LOGGING_DIR Folder for saving logs ./logs
N_WORKERS Number of workers to be used for the dataloader available CPUs

Generation (generate.py)

Variable Description Default value
MODEL Specify which model will be loaded
CHECKPOINT Path to the checkpoint for the specified model
VAE_CHECKPOINT Path to the VQ-VAE checkpoint to be used for the learned description (if applicable)
ROOT_DIR Folder containing MIDI files to extract descriptions from ./lmd_full
OUTPUT_DIR Folder to save generated MIDI samples to ./samples
MAX_ITER Max. number of tokens that should be generated 16,000
MAX_BARS Max. number of bars that should be generated 32
MAKE_MEDLEYS Set to True if descriptions should be combined into medleys. False
N_MEDLEY_PIECES Number of pieces to be combined into one 2
N_MEDLEY_BARS Number of bars to take from each piece 16
VERBOSE Logging level, set to 0 for silent execution 2

Evaluation (evaluate.py)

Variable Description Default value
SAMPLE_DIR Folder containing generated samples which should be evaluated ./samples
OUT_FILE CSV file to which a detailed log of all metrics will be saved to ./metrics.csv
MAX_SAMPLES Limit the number of samples to be used for computing evaluation metrics 1024
Owner
Dimitri
Dimitri
TensorFlow implementation of the algorithm in the paper "Decoupled Low-light Image Enhancement"

Decoupled Low-light Image Enhancement Shijie Hao1,2*, Xu Han1,2, Yanrong Guo1,2 & Meng Wang1,2 1Key Laboratory of Knowledge Engineering with Big Data

17 Apr 25, 2022
[ICLR 2021 Spotlight Oral] "Undistillable: Making A Nasty Teacher That CANNOT teach students", Haoyu Ma, Tianlong Chen, Ting-Kuei Hu, Chenyu You, Xiaohui Xie, Zhangyang Wang

Undistillable: Making A Nasty Teacher That CANNOT teach students "Undistillable: Making A Nasty Teacher That CANNOT teach students" Haoyu Ma, Tianlong

VITA 71 Dec 28, 2022
Code for our method RePRI for Few-Shot Segmentation. Paper at http://arxiv.org/abs/2012.06166

Region Proportion Regularized Inference (RePRI) for Few-Shot Segmentation In this repo, we provide the code for our paper : "Few-Shot Segmentation Wit

Malik Boudiaf 138 Dec 12, 2022
An implementation of the [Hierarchical (Sig-Wasserstein) GAN] algorithm for large dimensional Time Series Generation

Hierarchical GAN for large dimensional financial market data Implementation This repository is an implementation of the [Hierarchical (Sig-Wasserstein

11 Nov 29, 2022
Implementation of Google Brain's WaveGrad high-fidelity vocoder

WaveGrad Implementation (PyTorch) of Google Brain's high-fidelity WaveGrad vocoder (paper). First implementation on GitHub with high-quality generatio

Ivan Vovk 363 Dec 27, 2022
Sign Language Transformers (CVPR'20)

Sign Language Transformers (CVPR'20) This repo contains the training and evaluation code for the paper Sign Language Transformers: Sign Language Trans

Necati Cihan Camgoz 164 Dec 30, 2022
TensorFlow 2 implementation of the Yahoo Open-NSFW model

TensorFlow 2 implementation of the Yahoo Open-NSFW model

Bosco Yung 101 Jan 01, 2023
Solutions and questions for AoC2021. Merry christmas!

Advent of Code 2021 Merry christmas! 🎄 🎅 To get solutions and approximate execution times for implementations, please execute the run.py script in t

Wilhelm Ågren 5 Dec 29, 2022
CAST: Character labeling in Animation using Self-supervision by Tracking

CAST: Character labeling in Animation using Self-supervision by Tracking (Published as a conference paper at EuroGraphics 2022) Note: The CAST paper c

15 Nov 18, 2022
Weight initialization schemes for PyTorch nn.Modules

nninit Weight initialization schemes for PyTorch nn.Modules. This is a port of the popular nninit for Torch7 by @kaixhin. ##Update This repo has been

Alykhan Tejani 69 Jan 26, 2021
Differentiable molecular simulation of proteins with a coarse-grained potential

Differentiable molecular simulation of proteins with a coarse-grained potential This repository contains the learned potential, simulation scripts and

UCL Bioinformatics Group 44 Dec 10, 2022
Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System

Multi-Task Pre-Training for Plug-and-Play Task-Oriented Dialogue System Authors: Yixuan Su, Lei Shu, Elman Mansimov, Arshit Gupta, Deng Cai, Yi-An Lai

Amazon Web Services - Labs 123 Dec 23, 2022
Code for the upcoming CVPR 2021 paper

The Temporal Opportunist: Self-Supervised Multi-Frame Monocular Depth Jamie Watson, Oisin Mac Aodha, Victor Prisacariu, Gabriel J. Brostow and Michael

Niantic Labs 496 Dec 30, 2022
The official implementation of the CVPR2021 paper: Decoupled Dynamic Filter Networks

Decoupled Dynamic Filter Networks This repo is the official implementation of CVPR2021 paper: "Decoupled Dynamic Filter Networks". Introduction DDF is

F.S.Fire 180 Dec 30, 2022
[NeurIPS 2020] Official repository for the project "Listening to Sound of Silence for Speech Denoising"

Listening to Sounds of Silence for Speech Denoising Introduction This is the repository of the "Listening to Sounds of Silence for Speech Denoising" p

Henry Xu 40 Dec 20, 2022
Multi-Content GAN for Few-Shot Font Style Transfer at CVPR 2018

MC-GAN in PyTorch This is the implementation of the Multi-Content GAN for Few-Shot Font Style Transfer. The code was written by Samaneh Azadi. If you

Samaneh Azadi 422 Dec 04, 2022
[AAAI2021] The source code for our paper 《Enhancing Unsupervised Video Representation Learning by Decoupling the Scene and the Motion》.

DSM The source code for paper Enhancing Unsupervised Video Representation Learning by Decoupling the Scene and the Motion Project Website; Datasets li

Jinpeng Wang 114 Oct 16, 2022
Hierarchical Uniform Manifold Approximation and Projection

HUMAP Hierarchical Manifold Approximation and Projection (HUMAP) is a technique based on UMAP for hierarchical non-linear dimensionality reduction. HU

Wilson EstĂ©cio MarcĂ­lio JĂșnior 160 Jan 06, 2023
i3DMM: Deep Implicit 3D Morphable Model of Human Heads

i3DMM: Deep Implicit 3D Morphable Model of Human Heads CVPR 2021 (Oral) Arxiv | Poject Page This project is the official implementation our work, i3DM

Tarun Yenamandra 60 Jan 03, 2023
A simple and lightweight genetic algorithm for optimization of any machine learning model

geneticml This package contains a simple and lightweight genetic algorithm for optimization of any machine learning model. Installation Use pip to ins

Allan Barcelos 8 Aug 10, 2022