audioLIME: Listenable Explanations Using Source Separation

Overview

audioLIME

This repository contains the Python package audioLIME, a tool for creating listenable explanations for machine learning models in music information retrival (MIR). audioLIME is based on the method lime (local interpretable model-agnostic explanations) work presented in this paper and uses source separation estimates in order to create interpretable components.

Citing

If you use audioLIME in your work, please cite it:

@misc{haunschmid2020audiolime,
    title={{audioLIME: Listenable Explanations Using Source Separation}},
    author={Verena Haunschmid and Ethan Manilow and Gerhard Widmer},
    year={2020},
    eprint={2008.00582},
    archivePrefix={arXiv},
    primaryClass={cs.SD},
    howpublished={13th International Workshop on Machine Learning and Music}
}

Publications

audioLIME is introduced/used in the following publications:

  • Verena Haunschmid, Ethan Manilow and Gerhard Widmer, audioLIME: Listenable Explanations Using Source Separation

  • Verena Haunschmid, Ethan Manilow and Gerhard Widmer, Towards Musically Meaningful Explanations Using Source Separation

Installation

The audioLIME package is not on PyPI yet. For installing it, clone the git repo and install it using setup.py.

git clone https://github.com/CPJKU/audioLIME.git  # HTTPS
git clone [email protected]:CPJKU/audioLIME.git  # SSH
cd audioLIME
python setup.py install

To install a version for development purposes check out this article.

Tests

To test your installation, the following test are available:

python -m unittest tests.test_SpleeterFactorization

python -m unittest tests.test_DataProviders

Note on Requirements

To keep it lightweight, not all possible dependencies are contained in setup.py. Depending on the factorization you want to use, you might need different packages, e.g. nussl or spleeter.

Installation & Usage of spleeter

pip install spleeter==2.0.2

When you're using spleeter for the first time, it will download the used model in a directory pretrained_models. You can only change the location by setting an environment variable MODEL_PATH before spleeter is imported. There are different ways to set an environment variable, for example:

export MODEL_PATH=/share/home/verena/experiments/spleeter/pretrained_models/

Available Factorizations

Currently we have the following factorizations implemented:

  • SpleeterFactorization based on the source separation system spleeter (code)
  • SoundLIMEFactorization: time-frequency segmentation based on SoundLIME (the original implementation was not flexible enough for our experiments)

Usage Example

Here we demonstrate how we can explain the prediction of FCN (code, Choi 2016, Won 2020) using SpleeterFactorization.

For this to work you need to install the requirements found in the above mentioned repo of the tagger and spleeter:

pip install -r examples/requirements.txt
    data_provider = RawAudioProvider(audio_path)
    spleeter_factorization = SpleeterFactorization(data_provider,
                                                   n_temporal_segments=10,
                                                   composition_fn=None,
                                                   model_name='spleeter:5stems')

    explainer = lime_audio.LimeAudioExplainer(verbose=True, absolute_feature_sort=False)

    explanation = explainer.explain_instance(factorization=spleeter_factorization,
                                             predict_fn=predict_fn,
                                             top_labels=1,
                                             num_samples=16384,
                                             batch_size=32
                                             )

For the details on setting everything up, see example_using_spleeter_fcn.

Listen to the input and explanation.

TODOs

  • upload to pypi.org (to allow installation via pip)
  • usage example for SoundLIMEFactorization
  • tutorial in form of a Jupyter Notebook
  • more documentation
You might also like...
Offical implementation for
Offical implementation for "Trash or Treasure? An Interactive Dual-Stream Strategy for Single Image Reflection Separation".

Trash or Treasure? An Interactive Dual-Stream Strategy for Single Image Reflection Separation (NeurIPS 2021) by Qiming Hu, Xiaojie Guo. Dependencies P

PaddleRobotics is an open-source algorithm library for robots based on Paddle, including open-source parts such as human-robot interaction, complex motion control, environment perception, SLAM positioning, and navigation.

简体中文 | English PaddleRobotics paddleRobotics是基于paddle的机器人开源算法库集,包括人机交互、复杂运动控制、环境感知、slam定位导航等开源算法部分。 人机交互 主动多模交互技术TFVT-HRI 主动多模交互技术是通过视觉、语音、触摸传感器等输入机器人

Source-to-Source Debuggable Derivatives in Pure Python
Source-to-Source Debuggable Derivatives in Pure Python

Tangent Tangent is a new, free, and open-source Python library for automatic differentiation. Existing libraries implement automatic differentiation b

Empirical Study of Transformers for Source Code & A Simple Approach for Handling Out-of-Vocabulary Identifiers in Deep Learning for Source Code

Transformers for variable misuse, function naming and code completion tasks The official PyTorch implementation of: Empirical Study of Transformers fo

This source code is implemented using keras library based on "Automatic ocular artifacts removal in EEG using deep learning"

CSP_Deep_EEG This source code is implemented using keras library based on "Automatic ocular artifacts removal in EEG using deep learning" {https://www

An open source machine learning library for performing regression tasks using RVM technique.

Introduction neonrvm is an open source machine learning library for performing regression tasks using RVM technique. It is written in C programming la

This repository is an open-source implementation of the ICRA 2021 paper: Locus: LiDAR-based Place Recognition using Spatiotemporal Higher-Order Pooling.
This repository is an open-source implementation of the ICRA 2021 paper: Locus: LiDAR-based Place Recognition using Spatiotemporal Higher-Order Pooling.

Locus This repository is an open-source implementation of the ICRA 2021 paper: Locus: LiDAR-based Place Recognition using Spatiotemporal Higher-Order

This repository contains the source code for the paper
This repository contains the source code for the paper "DONeRF: Towards Real-Time Rendering of Compact Neural Radiance Fields using Depth Oracle Networks",

DONeRF: Towards Real-Time Rendering of Compact Neural Radiance Fields using Depth Oracle Networks Project Page | Video | Presentation | Paper | Data L

Source Code For Template-Based Named Entity Recognition Using BART

Template-Based NER Source Code For Template-Based Named Entity Recognition Using BART Training Training train.py Inference inference.py Corpus ATIS (h

Releases(v0.0.3)
Owner
Institute of Computational Perception
Johannes Kepler University
Institute of Computational Perception
Network Compression via Central Filter

Network Compression via Central Filter Environments The code has been tested in the following environments: Python 3.8 PyTorch 1.8.1 cuda 10.2 torchsu

2 May 12, 2022
A system used to detect whether a person is wearing a medical mask or not.

Mask_Detection_System A system used to detect whether a person is wearing a medical mask or not. To open the program, please follow these steps: Make

Mohamed Emad 0 Nov 17, 2022
Learning to Prompt for Continual Learning

Learning to Prompt for Continual Learning (L2P) Official Jax Implementation L2P is a novel continual learning technique which learns to dynamically pr

Google Research 207 Jan 06, 2023
Norm-based Analysis of Transformer

Norm-based Analysis of Transformer Implementations for 2 papers introducing to analyze Transformers using vector norms: Kobayashi+'20 Attention is Not

Goro Kobayashi 52 Dec 05, 2022
Transformer Tracking (CVPR2021)

TransT - Transformer Tracking [CVPR2021] Official implementation of the TransT (CVPR2021) , including training code and trained models. We are revisin

chenxin 465 Jan 06, 2023
ISTR: End-to-End Instance Segmentation with Transformers (https://arxiv.org/abs/2105.00637)

This is the project page for the paper: ISTR: End-to-End Instance Segmentation via Transformers, Jie Hu, Liujuan Cao, Yao Lu, ShengChuan Zhang, Yan Wa

Jie Hu 182 Dec 19, 2022
All public open-source implementations of convnets benchmarks

convnet-benchmarks Easy benchmarking of all public open-source implementations of convnets. A summary is provided in the section below. Machine: 6-cor

Soumith Chintala 2.7k Dec 30, 2022
Run Effective Large Batch Contrastive Learning on Limited Memory GPU

Gradient Cache Gradient Cache is a simple technique for unlimitedly scaling contrastive learning batch far beyond GPU memory constraint. This means tr

Luyu Gao 198 Dec 29, 2022
This repo provides function call to track multi-objects in videos

Custom Object Tracking Introduction This repo provides function call to track multi-objects in videos with a given trained object detection model and

Jeff Lo 51 Nov 22, 2022
PED: DETR for Crowd Pedestrian Detection

PED: DETR for Crowd Pedestrian Detection Code for PED: DETR For (Crowd) Pedestrian Detection Paper PED: DETR for Crowd Pedestrian Detection Installati

36 Sep 13, 2022
😇A pyTorch implementation of the DeepMoji model: state-of-the-art deep learning model for analyzing sentiment, emotion, sarcasm etc

------ Update September 2018 ------ It's been a year since TorchMoji and DeepMoji were released. We're trying to understand how it's being used such t

Hugging Face 865 Dec 24, 2022
An essential implementation of BYOL in PyTorch + PyTorch Lightning

Essential BYOL A simple and complete implementation of Bootstrap your own latent: A new approach to self-supervised Learning in PyTorch + PyTorch Ligh

Enrico Fini 48 Sep 27, 2022
🏅 The Most Comprehensive List of Kaggle Solutions and Ideas 🏅

🏅 Collection of Kaggle Solutions and Ideas 🏅

Farid Rashidi 2.3k Jan 08, 2023
Vision-Language Pre-training for Image Captioning and Question Answering

VLP This repo hosts the source code for our AAAI2020 work Vision-Language Pre-training (VLP). We have released the pre-trained model on Conceptual Cap

Luowei Zhou 373 Jan 03, 2023
Tensorflow2.0 🍎🍊 is delicious, just eat it! 😋😋

How to eat TensorFlow2 in 30 days ? 🔥 🔥 Click here for Chinese Version(中文版) 《10天吃掉那只pyspark》 🚀 github项目地址: https://github.com/lyhue1991/eat_pyspark

lyhue1991 9.7k Jan 01, 2023
Time Series Forecasting with Temporal Fusion Transformer in Pytorch

Forecasting with the Temporal Fusion Transformer Multi-horizon forecasting often contains a complex mix of inputs – including static (i.e. time-invari

Nicolás Fornasari 6 Jan 24, 2022
The official TensorFlow implementation of the paper Action Transformer: A Self-Attention Model for Short-Time Pose-Based Human Action Recognition

Action Transformer A Self-Attention Model for Short-Time Human Action Recognition This repository contains the official TensorFlow implementation of t

PIC4SeRCentre 20 Jan 03, 2023
Feed forward VQGAN-CLIP model, where the goal is to eliminate the need for optimizing the latent space of VQGAN for each input prompt

Feed forward VQGAN-CLIP model, where the goal is to eliminate the need for optimizing the latent space of VQGAN for each input prompt. This is done by

Mehdi Cherti 135 Dec 30, 2022
PyTorch implementation of Tacotron speech synthesis model.

tacotron_pytorch PyTorch implementation of Tacotron speech synthesis model. Inspired from keithito/tacotron. Currently not as much good speech quality

Ryuichi Yamamoto 279 Dec 09, 2022
Bottleneck Transformers for Visual Recognition

Bottleneck Transformers for Visual Recognition Experiments Model Params (M) Acc (%) ResNet50 baseline (ref) 23.5M 93.62 BoTNet-50 18.8M 95.11% BoTNet-

Myeongjun Kim 236 Jan 03, 2023