A two-stage U-Net for high-fidelity denoising of historical recordings

Last update: Jan 05, 2023

Overview

A two-stage U-Net for high-fidelity denoising of historical recordings

Official repository of the paper (not submitted yet):

E. Moliner and V. Välimäki,, "A two-stage U-Net for high-fidelity denosing of historical recordinds", in Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Singapore, May, 2022

Abstract

Enhancing the sound quality of historical music recordings is a long-standing problem. This paper presents a novel denoising method based on a fully-convolutional deep neural network. A two-stage U-Net model architecture is designed to model and suppress the degradations with high fidelity. The method processes the time-frequency representation of audio, and is trained using realistic noisy data to jointly remove hiss, clicks, thumps, and other common additive disturbances from old analog discs. The proposed model outperforms previous methods in both objective and subjective metrics. The results of a formal blind listening test show that the method can denoise real gramophone recordings with an excellent quality. This study shows the importance of realistic training data and the power of deep learning in audio restoration.

Listen to our audio samples

Requirements

You will need at least python 3.7 and CUDA 10.1 if you want to use GPU. See requirements.txt for the required package versions.

To install the environment through anaconda, follow the instructions:

conda env update -f environment.yml
conda activate historical_denoiser

Denoising Recordings

Run the following commands to clone the repository and install the pretrained weights of the two-stage U-Net model:

git clone https://github.com/eloimoliner/denoising-historical-recordings.git
cd denoising-historical-recordings
wget https://github.com/eloimoliner/denoising-historical-recordings/releases/download/v0.0/checkpoint.zip
unzip checkpoint.zip /experiments/trained_model/

If the environment is installed correctly, you can denoise an audio file by running:

bash inference.sh "file name"

A ".wav" file with the denoised version, as well as the residual noise and the original signal in "mono", will be generated in the same directory as the input file.

Training

TODO

Comments

Will it work in Windows without CUDA?

Hello, The readme says: "You will need at least python 3.7 and CUDA 10.1 if you want to use GPU."

Unfortunately, my first attempt to run it in Windows without CUDA-supporting VGA failed. There is really no separate environment file for CPU-only? Is it possible to make it work without massive changes to the code?

opened by vitacon 15
installation without conda

Hi,

could you leave some hints about how to install this without conda? Your readme appears to be very much specified to this one case. Also it seems that you develop under linux so you use bash to execute. Maybe here a hint for win- users would be cool too.

I am just trying to get this to run under windows and so far had no success. I will update if I get further. All the best!

opened by GitHubGeniusOverlord 9
strange tensorflow version in requirements.txt

Hi,

when running python -m pip install tensorflow==2.3.0 as indicated in your requirements file, I get

ERROR: Could not find a version that satisfies the requirement tensorflow==2.3.0 (from versions: 2.5.0rc0, 2.5.0rc1, 2.5.0rc2, 2.5.0rc3, 2.5.0, 2.5.1, 2.5.2, 2.6.0rc0, 2.6.0rc1, 2.6.0rc2, 2.6.0, 2.6.1, 2.6.2, 2.7.0rc0, 2.7.0rc1, 2.7.0, 2.8.0rc0) ERROR: No matching distribution found for tensorflow==2.3.0

It seems this version isn't even supported by pip anymore. Upgrade to 2.5.0?

The same is true for scipy==1.4.1. Not sure about which version to take there.

opened by GitHubGeniusOverlord 3
Update inference.sh

Small change to allow spaces in file names. Bash expands the variable $1 correctly even if it is in double quotes, python receives a single argument and not (if there are spaces) multiple arguments.

opened by JorenSix 1
How to start training for denoising?

If I would like to do a denoising task, where I've clean signals (in the "clean" folder) and noisy signals (in the "noise" folder).

opened by listener17 1

Releases(v0.0)

v0.0(Aug 31, 2021)

Uploading pretrained model
Source code(tar.gz)
Source code(zip)
checkpoint.zip(251.80 MB)

Owner

Eloi Moliner Juanpere

Doctoral candidate on audio signal processing at Aalto university.

GitHub Repository

A task Provided by A respective Artenal Ai and Ml based Company to complete it

A task Provided by A respective Alternal Ai and Ml based Company to complete it .

1 Jan 25, 2022

ADGAN - The Implementation of paper Controllable Person Image Synthesis with Attribute-Decomposed GAN

ADGAN - The Implementation of paper Controllable Person Image Synthesis with Attribute-Decomposed GAN CVPR 2020 (Oral); Pose and Appearance Attributes Transfer;

400 Dec 29, 2022

A python software that can help blind people find things like laptops, phones, etc the same way a guide dog guides a blind person in finding his way.

GuidEye A python software that can help blind people find things like laptops, phones, etc the same way a guide dog guides a blind person in finding h

0 Aug 09, 2022

A Pytorch Implementation of [Source data‐free domain adaptation of object detector through domain

A Pytorch Implementation of Source data‐free domain adaptation of object detector through domain‐specific perturbation Please follow Faster R-CNN and

1 Dec 25, 2021

This repo holds the code of TransFuse: Fusing Transformers and CNNs for Medical Image Segmentation

TransFuse This repo holds the code of TransFuse: Fusing Transformers and CNNs for Medical Image Segmentation Requirements Pytorch=1.6.0, 1.9.0 (=1.

93 Dec 19, 2022

yolov5目标检测模型的知识蒸馏（基于响应的蒸馏）

代码地址： https://github.com/Sharpiless/yolov5-knowledge-distillation 教师模型： python train.py --weights weights/yolov5m.pt \ --cfg models/yolov5m.ya

52 Dec 04, 2022

Histology images query (unsupervised)

110-1-NTU-DBME5028-Histology-images-query Final Project: Histology images query (unsupervised) Kaggle: https://www.kaggle.com/c/histology-images-query

1 Jan 05, 2022

A PyTorch implementation of the baseline method in Panoptic Narrative Grounding (ICCV 2021 Oral)

52 Dec 19, 2022

unofficial pytorch implementation of RefineGAN

RefineGAN unofficial pytorch implementation of RefineGAN (https://arxiv.org/abs/1709.00753) for CSMRI reconstruction, the official code using tensorpa

5 Jul 21, 2022

Code for the paper "Improving Vision-and-Language Navigation with Image-Text Pairs from the Web" (ECCV 2020)

Improving Vision-and-Language Navigation with Image-Text Pairs from the Web Arjun Majumdar, Ayush Shrivastava, Stefan Lee, Peter Anderson, Devi Parikh

44 Dec 14, 2022

FedMM: Saddle Point Optimization for Federated Adversarial Domain Adaptation

This repository contains the code accompanying the paper " FedMM: Saddle Point Optimization for Federated Adversarial Domain Adaptation" Paper link: R

20 Jun 29, 2022

Adaptive FNO transformer - official Pytorch implementation

Adaptive Fourier Neural Operators: Efficient Token Mixers for Transformers This repository contains PyTorch implementation of the Adaptive Fourier Neu

77 Dec 29, 2022

Tensor-based approaches for fMRI classification

tensor-fmri Using tensor-based approaches to classify fMRI data from StarPLUS. Citation If you use any code in this repository, please cite the follow

4 Sep 07, 2022

A quick recipe to learn all about Transformers

Transformers have accelerated the development of new techniques and models for natural language processing (NLP) tasks.

772 Dec 31, 2022

Supervised & unsupervised machine-learning techniques are applied to the database of weighted P4s which admit Calabi-Yau hypersurfaces.

Weighted Projective Spaces ML Description: The database of 5-vectors describing 4d weighted projective spaces which admit Calabi-Yau hypersurfaces are

3 Sep 08, 2022

A multi-functional library for full-stack Deep Learning. Simplifies Model Building, API development, and Model Deployment.

chitra What is chitra? chitra (चित्र) is a multi-functional library for full-stack Deep Learning. It simplifies Model Building, API development, and M

210 Dec 21, 2022

A two-stage U-Net for high-fidelity denoising of historical recordings

Related tags

Overview

A two-stage U-Net for high-fidelity denoising of historical recordings

Abstract

Requirements

Denoising Recordings

Training

Comments

Will it work in Windows without CUDA?

installation without conda

strange tensorflow version in requirements.txt

Update inference.sh

How to start training for denoising?

Releases(v0.0)

v0.0(Aug 31, 2021)

Owner

Eloi Moliner Juanpere

A task Provided by A respective Artenal Ai and Ml based Company to complete it

ADGAN - The Implementation of paper Controllable Person Image Synthesis with Attribute-Decomposed GAN

A python software that can help blind people find things like laptops, phones, etc the same way a guide dog guides a blind person in finding his way.

A Pytorch Implementation of [Source data‐free domain adaptation of object detector through domain

This repo holds the code of TransFuse: Fusing Transformers and CNNs for Medical Image Segmentation

yolov5目标检测模型的知识蒸馏（基于响应的蒸馏）

Histology images query (unsupervised)

A PyTorch implementation of the baseline method in Panoptic Narrative Grounding (ICCV 2021 Oral)

unofficial pytorch implementation of RefineGAN

Code for the paper "Improving Vision-and-Language Navigation with Image-Text Pairs from the Web" (ECCV 2020)

FedMM: Saddle Point Optimization for Federated Adversarial Domain Adaptation

Adaptive FNO transformer - official Pytorch implementation

Tensor-based approaches for fMRI classification

A quick recipe to learn all about Transformers

Supervised & unsupervised machine-learning techniques are applied to the database of weighted P4s which admit Calabi-Yau hypersurfaces.

A multi-functional library for full-stack Deep Learning. Simplifies Model Building, API development, and Model Deployment.

Classification of Long Sequential Data using Circular Dilated Convolutional Neural Networks

Gluon CV Toolkit

[NeurIPS 2020] Code for the paper "Balanced Meta-Softmax for Long-Tailed Visual Recognition"

Official implementation of "DSP: Dual Soft-Paste for Unsupervised Domain Adaptive Semantic Segmentation"