MIDI-DDSP: Detailed Control of Musical Performance via Hierarchical Modeling

Related tags

Audiomidi-ddsp
Overview
logo

MIDI-DDSP: Detailed Control of Musical Performance via Hierarchical Modeling

Demos | Blog Post | Colab Notebook | Paper | Hugging Face Spaces

MIDI-DDSP is a hierarchical audio generation model for synthesizing MIDI expanded from DDSP.

Links

Install MIDI-DDSP

You could install MIDI-DDSP via pip, which allows you to use the cool Command-line MIDI synthesis to synthesize your MIDI.

To install MIDI-DDSP via pip, simply run:

pip install midi-ddsp

Train MIDI-DDSP

To train MIDI-DDSP, please first install midi-ddsp and clone the MIDI-DDSP repository:

git clone https://github.com/magenta/midi-ddsp.git

For dataset, please download the tfrecord files for the URMP dataset in here to the data folder in your cloned repository using the following commands:

cd midi-ddsp # enter the project directory
mkdir ./data # create a data folder
gsutil cp gs://magentadata/datasets/urmp/urmp_20210324/* ./data # download tfrecords to directory

Please check here for how to install and use gsutil.

Finally, you can run the script train_midi_ddsp.sh to train the exact same model we used in the paper:

sh ./train_midi_ddsp.sh

The current codebase does not support training with arbitrary dataset, but we will hopefully update that in the near future.

Side note:

If one download the dataset to a different location, please change the data_dir parameter in train_midi_ddsp.sh.

The training of MIDI-DDSP takes approximately 18 hours on a single RTX 8000. The training code for now does not support multi-GPU training. We recommend using a GPU with more than 24G of memory when training Synthesis Generator in batch size of 16. For a GPU with less memory, please consider using a smaller batch size and change the batch size in train_midi_ddsp.sh.

Try to play with MIDI-DDSP yourself!

Please try out MIDI-DDSP in Colab notebooks!

In this notebook, you will try to use MIDI-DDSP to synthesis a monophonic MIDI file, adjust note expressions, make pitch bend by adjusting synthesis parameters, and synthesize quartet from Bach chorales.

We have trained MIDI-DDSP on the URMP dataset which support synthesizing 13 instruments: violin, viola, cello, double bass, flute, oboe, clarinet, saxophone, bassoon, trumpet, horn, trombone, tuba. You could find how to download and use our pre-trained model below:

Command-line MIDI synthesis

On can use the MIDI-DDSP as a command-line MIDI synthesizer just like FluidSynth.

To use command-line synthesis to synthesize a midi file, please first download the model weights by running:

midi_ddsp_download_model_weights

To synthesize a midi file simply run the following command:

midi_ddsp_synthesize --midi_path <path-to-midi>

For a starter, you can try to synthesize the example midi file in this repository:

midi_ddsp_synthesize --midi_path ./midi_example/ode_to_joy.mid

The command line also enables synthesize a folder of midi files. For more advance use (synthesize a folder, using FluidSynth for instruments not supported, etc.), please see synthesize_midi.py --help.

If you have a trouble downloading the model weights, please manually download from here, and specify the synthesis_generator_weight_path and expression_generator_weight_path by yourself when using the command line. You can also specify your other model weights if you want to use your own trained model.

Python Usage

After installing midi-ddsp, you could import midi-ddsp in python and synthesize MIDI in your code.

Minimal Example

Here is a simple example to use MIDI-DDSP to synthesize a midi file:

from midi_ddsp import synthesize_midi, load_pretrained_model

midi_file = 'ode_to_joy.mid'
# Load pre-trained model
synthesis_generator, expression_generator = load_pretrained_model()
# Synthesize MIDI
output = synthesize_midi(synthesis_generator, expression_generator, midi_file)
# The synthesized audio
synthesized_audio = output['mix_audio']

Advance Usage

Here is an advance example to synthesize the ode_to_joy.mid, change the note expression controls, and adjust the synthesis parameters:

import numpy as np
import tensorflow as tf
from midi_ddsp.utils.midi_synthesis_utils import synthesize_mono_midi, conditioning_df_to_audio
from midi_ddsp.utils.inference_utils import get_process_group
from midi_ddsp.midi_ddsp_synthesize import load_pretrained_model
from midi_ddsp.data_handling.instrument_name_utils import INST_NAME_TO_ID_DICT

# -----MIDI Synthesis-----
midi_file = 'ode_to_joy.mid'
# Load pre-trained model
synthesis_generator, expression_generator = load_pretrained_model()
# Synthesize with violin:
instrument_name = 'violin'
instrument_id = INST_NAME_TO_ID_DICT[instrument_name]
# Run model prediction
midi_audio, midi_control_params, midi_synth_params, conditioning_df = synthesize_mono_midi(synthesis_generator,
                                                                                           expression_generator,
                                                                                           midi_file, instrument_id,
                                                                                           output_dir=None)

synthesized_audio = midi_audio  # The synthesized audio

# -----Adjust note expression controls and re-synthesize-----

# Make all notes weak vibrato:
conditioning_df_changed = conditioning_df.copy()
note_vibrato = conditioning_df_changed['vibrato_extend'].value
conditioning_df_changed['vibrato_extend'] = np.ones_like(conditioning_df['vibrato_extend'].values) * 0.1
# Re-synthesize
midi_audio_changed, midi_control_params_changed, midi_synth_params_changed = conditioning_df_to_audio(
  synthesis_generator, conditioning_df_changed, tf.constant([instrument_id]))

synthesized_audio_changed = midi_audio_changed  # The synthesized audio

# There are 6 note expression controls in conditioning_df that you could change:
# 'amplitude_mean', 'amplitude_std', 'vibrato_extend', 'brightness', 'attack_level', 'amplitudes_max_pos'.
# Please refer to https://colab.research.google.com/github/magenta/midi-ddsp/blob/main/midi_ddsp/colab/MIDI_DDSP_Demo.ipynb#scrollTo=XfPPrdPu5sSy for the effect of each control. 

# -----Adjust synthesis parameters and re-synthesize-----

# The original synthesis parameters:
f0_ori = midi_synth_params['f0_hz']
amps_ori = midi_synth_params['amplitudes']
noise_ori = midi_synth_params['noise_magnitudes']
hd_ori = midi_synth_params['harmonic_distribution']

# TODO: make your change of the synthesis parameters here:
f0_changed = f0_ori
amps_changed = amps_ori
noise_changed = noise_ori
hd_changed = hd_ori

# Resynthesis the audio using DDSP
processor_group = get_process_group(midi_synth_params['amplitudes'].shape[1], use_angular_cumsum=True)
midi_audio_changed = processor_group({'amplitudes': amps_changed,
                                      'harmonic_distribution': hd_changed,
                                      'noise_magnitudes': noise_changed,
                                      'f0_hz': f0_changed, },
                                     verbose=False)
midi_audio_changed = synthesis_generator.reverb_module(midi_audio_changed, reverb_number=instrument_id, training=False)

synthesized_audio_changed = midi_audio_changed  # The synthesized audio
Comments
  • ImportError and AttributeError

    ImportError and AttributeError

    Hi! Very interesting work! Iโ€™m trying to run MIDI_DDSP_Demo.ipynb, and I encountered some errors.

    1. ImportError: cannot import name 'LD_RANGE' occurred in from ddsp.spectral_ops import F0_RANGE, LD_RANGE(see here). -> According to DDSP, I think 'DB_RANGE' is correct, not 'LD_RANGE' .

    2. AttributeError: module 'ddsp.spectral_ops' has no attribute 'amplitude_to_db' occurred where ddsp.spectral_ops.amplitude_to_db is used (see here). -> According to DDSP, I suppose it is 'ddsp.core', not 'ddsp.spectral_ops'.

    opened by MasayaKawamura 3
  • Question on the installation

    Question on the installation

    Hi,

    When I opened up a new colab notebook and tried to pip install the midi-ddsp package, it took over 40 minutes and the installation can not be completed. I didn't experience this until this week.

    I've tried pip install midi-ddsp and pip install git+https://github.com/magenta/midi-ddsp and both gave me the same results.

    It seemed like pip would spend a lot of time trying to find which version of etils was compatible.

    opened by tiianhk 2
  • Error when using

    Error when using "pip install midi-ddsp"

    Hello everyone,

    I'm trying to use midi-ddsp to synthesize a few .midi files. In order to achieve it, I create a virtual environment with:

    python3 -m venv .venv source .venv/bin/activate

    note: python version == 3.10.6

    After creating it I run: "pip install midi-ddsp" . Getting this error message:

    Collecting ddsp Using cached ddsp-1.9.0-py2.py3-none-any.whl (200 kB) Using cached ddsp-1.7.1-py2.py3-none-any.whl (199 kB) Using cached ddsp-1.7.0-py2.py3-none-any.whl (197 kB) Using cached ddsp-1.6.5-py2.py3-none-any.whl (194 kB) Using cached ddsp-1.6.3-py2.py3-none-any.whl (194 kB) Using cached ddsp-1.6.2-py2.py3-none-any.whl (194 kB) Using cached ddsp-1.6.0-py2.py3-none-any.whl (194 kB) Using cached ddsp-1.4.0-py2.py3-none-any.whl (192 kB) Using cached ddsp-1.3.1-py2.py3-none-any.whl (192 kB) Using cached ddsp-1.3.0-py2.py3-none-any.whl (183 kB) Using cached ddsp-1.2.0-py2.py3-none-any.whl (179 kB) Using cached ddsp-1.1.0-py2.py3-none-any.whl (175 kB) Using cached ddsp-1.0.1-py2.py3-none-any.whl (170 kB) Using cached ddsp-1.0.0-py2.py3-none-any.whl (168 kB) Using cached ddsp-0.14.0-py2.py3-none-any.whl (143 kB) Using cached ddsp-0.13.1-py2.py3-none-any.whl (129 kB) Using cached ddsp-0.13.0-py2.py3-none-any.whl (129 kB) Using cached ddsp-0.12.0-py2.py3-none-any.whl (127 kB) Using cached ddsp-0.10.0-py2.py3-none-any.whl (109 kB) Using cached ddsp-0.9.0-py2.py3-none-any.whl (109 kB) Using cached ddsp-0.8.0-py2.py3-none-any.whl (108 kB) Using cached ddsp-0.7.0-py2.py3-none-any.whl (107 kB) Using cached ddsp-0.5.1-py2.py3-none-any.whl (101 kB) Using cached ddsp-0.5.0-py2.py3-none-any.whl (101 kB) Using cached ddsp-0.4.0-py2.py3-none-any.whl (97 kB) Using cached ddsp-0.2.4-py2.py3-none-any.whl (89 kB) Using cached ddsp-0.2.3-py2.py3-none-any.whl (89 kB) Using cached ddsp-0.2.2-py2.py3-none-any.whl (89 kB) Using cached ddsp-0.2.0-py2.py3-none-any.whl (88 kB) Using cached ddsp-0.1.0-py3-none-any.whl (88 kB) Using cached ddsp-0.0.10-py3-none-any.whl (88 kB) Using cached ddsp-0.0.9-py3-none-any.whl (86 kB) Using cached ddsp-0.0.8-py3-none-any.whl (86 kB) Using cached ddsp-0.0.7-py3-none-any.whl (85 kB) Using cached ddsp-0.0.6-py2.py3-none-any.whl (91 kB) Using cached ddsp-0.0.5-py2.py3-none-any.whl (91 kB) Using cached ddsp-0.0.4-py2.py3-none-any.whl (83 kB) Using cached ddsp-0.0.3-py2.py3-none-any.whl (81 kB) Using cached ddsp-0.0.1-py2.py3-none-any.whl (75 kB) Using cached ddsp-0.0.0-py2.py3-none-any.whl (75 kB) INFO: pip is looking at multiple versions of midi-ddsp to determine which version is compatible with other requirements. This could take a while. Collecting midi-ddsp Using cached midi_ddsp-0.1.3-py3-none-any.whl (56 kB) Using cached midi_ddsp-0.1.1-py3-none-any.whl (56 kB) Using cached midi_ddsp-0.1.0-py3-none-any.whl (53 kB) ERROR: Cannot install midi-ddsp because these package versions have conflicting dependencies.

    The conflict is caused by: ddsp 3.4.4 depends on tensorflow ddsp 3.4.3 depends on tensorflow ddsp 3.4.1 depends on tensorflow ddsp 3.4.0 depends on tensorflow ddsp 3.3.6 depends on tensorflow ddsp 3.3.4 depends on tensorflow ddsp 3.3.2 depends on tensorflow ddsp 3.3.0 depends on tensorflow ddsp 3.2.1 depends on tensorflow ddsp 3.2.0 depends on tensorflow ddsp 3.1.0 depends on tensorflow ddsp 1.9.0 depends on tensorflow ddsp 1.7.1 depends on tensorflow ddsp 1.7.0 depends on tensorflow ddsp 1.6.5 depends on tensorflow ddsp 1.6.3 depends on tensorflow ddsp 1.6.2 depends on tensorflow ddsp 1.6.0 depends on tensorflow ddsp 1.4.0 depends on tensorflow ddsp 1.3.1 depends on tensorflow ddsp 1.3.0 depends on tensorflow ddsp 1.2.0 depends on tensorflow ddsp 1.1.0 depends on tensorflow ddsp 1.0.1 depends on tensorflow ddsp 1.0.0 depends on tensorflow ddsp 0.14.0 depends on tensorflow ddsp 0.13.1 depends on tensorflow ddsp 0.13.0 depends on tensorflow ddsp 0.12.0 depends on tensorflow ddsp 0.10.0 depends on tensorflow ddsp 0.9.0 depends on tensorflow ddsp 0.8.0 depends on tensorflow ddsp 0.7.0 depends on tensorflow ddsp 0.5.1 depends on tensorflow ddsp 0.5.0 depends on tensorflow ddsp 0.4.0 depends on tensorflow ddsp 0.2.4 depends on tensorflow ddsp 0.2.3 depends on tensorflow ddsp 0.2.2 depends on tensorflow ddsp 0.2.0 depends on tensorflow ddsp 0.1.0 depends on tensorflow ddsp 0.0.10 depends on tensorflow ddsp 0.0.9 depends on tensorflow ddsp 0.0.8 depends on tensorflow ddsp 0.0.7 depends on tensorflow ddsp 0.0.6 depends on tensorflow ddsp 0.0.5 depends on tensorflow ddsp 0.0.4 depends on tensorflow ddsp 0.0.3 depends on tensorflow ddsp 0.0.1 depends on tensorflow ddsp 0.0.0 depends on tensorflow>=2.1.0

    To fix this you could try to:

    1. loosen the range of package versions you've specified
    2. remove package versions to allow pip attempt to solve the dependency conflict

    ERROR: Resolution Impossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts

    ======== I am currently working from an M1 mac so I would be pleased if you could help me reaching a solution for this problem.

    Thank you in advance, Juan Carlos

    opened by JuanCarlosMartinezSevilla 2
  • Why is vibrato_rate not used?

    Why is vibrato_rate not used?

    In my opinion, vibrato_rate (peak frequency) is more plausible than vibrato_extend (peak amplitude) to represent pitch pulsating. Why is vibrato_rate not used?

    opened by bfs18 2
  • Docker image with all the necessary packages

    Docker image with all the necessary packages

    Hi again!

    I'm still trying to execute your code with the "pip install midi-ddsp".

    I'm using docker from a tensorflow/tensorflow:latest image and running the pip command. I get this error:

    "E tensorflow/stream_executor/cuda/cuda_blas.cc:2981] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered"

    I ask you guys in order to know if you work with docker... if I could be able to have access to the image you use so all versions are the same and to avoid these type of errors.

    Thank you, Best regards, Juan Carlos

    opened by JuanCarlosMartinezSevilla 0
  • How to distinguish whether it is cresc or decresc in fluctuation?

    How to distinguish whether it is cresc or decresc in fluctuation?

    Hi! I am interested in your project. Currently, I am doing similar stuff, but i have some questions. (1) I know that Fluctuation stands for how much does that note cresc or decresc. Peak means which position got the maximum energy. However, i am wondering how to know if that note is cresc or decresc. Let's assume our peak position is 0.4, and fluc is 0.3. Then, is it decrease 0.3 or increase 0.3? How to distinguish it? (2) I know that all the expressive control values are normalized between 0 and 1, but what's the unit measure in these expressive controls?

    Thanks for answering

    opened by pillow8781 2
  • What is ``input`` in the def call()?

    What is ``input`` in the def call()?

    Hi, I am looking inside the code. I've seen a lot of methods about def call(self, inputs) in your code, especially looking at this one.

      def call(self, inputs):
        synth_params = self.get_synth_params(inputs)
    

    However, I couldn't find out what's the calculation of inputs, there are some clues I've found. In those codes, inputs is respond to the data in get_fake_data_synthesis_generator, then what are the data and units you input to get_fake_data_synthesis_generator? Frames? Amplitude or anything else?

    Thanks!

    opened by Megan8821 4
  • How to solve the error when installing midi-ddsp?

    How to solve the error when installing midi-ddsp?

    Hi, I am new in training model, and I got some problems in here. Does anyone how to solve the error of error: subprocess-exited-with -error and error: metadata-generation-failed in the terminal?

    opened by Megan8821 4
  • tfrecord features

    tfrecord features

    Hello, I'm very interested in your amazing work. Took a deep look at the urmp tfrecord datasets, I found something a bit confusing for me. Could you be so kind to help?

    1. I checked some urmp_tfrecords data, I found that some of them contain the following features: {"audio", "f0_confidence", "f0_hz", "f0_time", "id", "instrument_id", "loudness_db", "note_active_frame_indices", "note_active_velocities", "note_offsets", "note_onsets", "orig_f0_hz", "orig_f0_time", "power_db", "recording_id", "sequence"}. However, some of them don't include {"orig_f0_hz", "orig_f0_time"} in their tfrecord data. Why is this so and does such an inconsistency influence the model training?
    2. I want to include piano music when I train my own model. To this end, I think I need to generate tfrecords that have the same content as the urmp ones you used in your model. I plan to use maestro dataset. Could you be so kind to indicate if there's a tfrecord data generation code that we can take as a reference? Like the one you used to generate tfrecords for the midi-ddsp model?
    3. What is the difference between "batched" and "unbatched" dataset?

    Thank you very much for your help in advance.

    opened by gladys0313 1
  • Can i test with custom data?

    Can i test with custom data?

    Hi, i am interested in this exciting project and i am trying to test this with our custom dataset and reproduce the format of original data. But there are some difficulties and questions below.

    1. Is there no way to use custom datasets at all?
    1. Is there any code to calculate elements of dataset below?
    • I want to know how to get "note_active_velocities", "note_active_frame_indices", "power_db", "note_onsets", "note_offsets" but there is no any code on repository.

    Thank you for reading!

    opened by 589hero 6
  • Question on Figure

    Question on Figure

    image

    quick question on this figure in the blog post: i know coconet is its own model that will generate subsequent melodies given the input midi file. however, should i decide to train midi ddsp, will the training of coconet also be a part of this? or should i expect a monophonic midi melody as input and the generated audio as output.

    thanks for all the help and this awesome project

    opened by theadamsabra 7
Releases(v0.2.5)
Owner
Magenta
An open source research project exploring the role of machine learning as a tool in the creative process.
Magenta
Scalable audio processing framework written in Python with a RESTful API

TimeSide : scalable audio processing framework and server written in Python TimeSide is a python framework enabling low and high level audio analysis,

Parisson 340 Jan 04, 2023
NovaMusic is a music sharing robot. Users can get music and music lyrics using inline queries.

A music sharing telegram robot using Redis database and Telebot python library using Redis database.

Hesam Norin 7 Oct 21, 2022
nicfit 425 Jan 01, 2023
Desktop music recognition application for windows

MusicRecognizer Music recognition application for windows You can choose from which of the devices the recording will be made. If you choose speakers,

Nikita Merzlyakov 28 Dec 13, 2022
OpenClubhouse - A third-part web application based on flask to play Clubhouse audio.

OpenClubhouse - A third-part web application based on flask to play Clubhouse audio.

1.1k Jan 05, 2023
A music player designed for a University Project.

A music player designed for a University Project. Very flexibe and easy to use, a real life working application with user friendly controls. Hope u enjoy!!

Aditya Johorey 1 Nov 19, 2021
XA Music Player - Telegram Music Bot

XA Music Player Requirements ๐Ÿ“ FFmpeg (Latest) NodeJS nodesource.com (NodeJS 17+) Python (3.10+) PyTgCalls (Lastest) MongoDB (3.12.1) 2nd Telegram Ac

RexAshh 3 Jun 30, 2022
A collection of python scripts for extracting and analyzing acoustics from audio files.

pyAcoustics A collection of python scripts for extracting and analyzing acoustics from audio files. Contents 1 Common Use Cases 2 Major revisions 3 Fe

Tim 74 Dec 26, 2022
Converting UGG files from Rode Wireless Go II transmitters (unsompressed recordings) to WAV format

Rode_WirelessGoII_UGG2wav Converting UGG files from Rode Wireless Go II transmitters (uncompressed recordings) to WAV format Story I backuped the .ugg

Jรกn Mazanec 31 Dec 22, 2022
DeepMusic is an easy to use Spotify like app to manage and listen to your favorites musics.

DeepMusic is an easy to use Spotify like app to manage and listen to your favorites musics. Technically, this project is an Android Client and its ent

Labrak Yanis 1 Jul 12, 2021
Sequencer: Deep LSTM for Image Classification

Sequencer: Deep LSTM for Image Classification Created by Yuki Tatsunami Masato Taki This repository contains implementation for Sequencer. Abstract In

Yuki Tatsunami 111 Dec 16, 2022
๐ŸŽต Python sound notifications made easy

chime Python sound notifications made easy. Table of contents Table of contents Motivation Installation Basic usage Theming IPython/Jupyter magic Exce

Max Halford 231 Jan 09, 2023
This bot can stream audio or video files and urls in telegram voice chats

Voice Chat Streamer This bot can stream audio or video files and urls in telegram voice chats :) ๐ŸŽฏ Follow me and star this repo for more telegram bot

WiskeyWorm 4 Oct 09, 2022
An audio guide for destroying oracles in Destiny's Vault of Glass raid

prophet An audio guide for destroying oracles in Destiny's Vault of Glass raid. This project allows you to make any encounter with oracles without hav

24 Sep 15, 2022
Guide & Examples to create deeplearning gstreamer plugins and use them in your pipeline

upai-gst-dl-plugins Guide & Examples to create deeplearning gstreamer plugins and use them in your pipeline Introduction Thanks to the work done by @j

UPAI.IO 11 Dec 11, 2022
Audio library for modelling loudness

Loudness Loudness is a C++ library with Python bindings for modelling perceived loudness. The library consists of processing modules which can be casc

Dominic Ward 33 Oct 02, 2022
Python module for handling audio metadata

Mutagen is a Python module to handle audio metadata. It supports ASF, FLAC, MP4, Monkey's Audio, MP3, Musepack, Ogg Opus, Ogg FLAC, Ogg Speex, Ogg The

Quod Libet 1.1k Dec 31, 2022
Inner ear models for Python

cochlea cochlea is a collection of inner ear models. All models are easily accessible as Python functions. They take sound signal as input and return

98 Jan 05, 2023
Python library for handling audio datasets.

AUDIOMATE Audiomate is a library for easy access to audio datasets. It provides the datastructures for accessing/loading different datasets in a gener

Matthias 121 Nov 27, 2022
Python interface to the WebRTC Voice Activity Detector

py-webrtcvad This is a python interface to the WebRTC Voice Activity Detector (VAD). It is compatible with Python 2 and Python 3. A VAD classifies a p

John Wiseman 1.5k Dec 22, 2022