Acoustic Noise Cancellation by Machine Learning

DIY Noise-Cancellation System prototype made with TensorFlow.

Image by TheDigitalArtist on Pixabay

In this post I describe how I built an active noise cancellation system based on neural networks on my own. I’ve just got my first results, which I am sharing here, but the system is still a tangle of scripts, binaries, wires, a sound card, a microphone and headphones, so I am not going to publish any sources yet. Maybe later.

Pet project with Tensorflow, ALSA, C++ & SIMD, and Python

Over the last year I’ve been building an Acoustic Noise Cancellation system based on an Artificial Neural Network. I did it in my spare time, which is why a relatively small experiment took so long. The system I’ve built is a proof of concept: it shows that the idea of using a NN as a noise canceller is viable.

I was impressed by recent achievements of ML in image processing, like neural style transfer, and I thought: “What if we teach an RNN repeated audio noise patterns so that ANC can suppress them?” I had implemented a perceptron on a DSP for some radar stuff before, so I wasn’t afraid of having to implement a neural network in real-time software. Although I hadn’t dealt with TensorFlow or GPUs before, I was eager to do something with them. It has been an interesting trip so far: I finally achieved some results and had a lot of developer’s fun along the way.

Artificial Brain Denoising

Why do we need ML to suppress noise?

Modern ANC systems successfully suppress stationary noises like the one in an aircraft. The common approach uses adaptive filters like LMS or RLS. The main flaw of these methods is that at their core they are just FIR filters whose coefficients are constantly being adapted, and such a filter fits the noise only after a few thousand samples. This means that an adaptive filter starts re-adapting after every change in the environmental noise, even if it heard the same pattern several seconds earlier. Using ML methods in ANC could improve performance because neural networks are able to adapt to signals with a very complex internal structure.
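To make the adaptive-filter idea concrete, here is a minimal LMS sketch in plain Python. It is only an illustration, not code from my system: the tap count, step size and toy sine signal are assumptions chosen so that the example converges quickly.

```python
import math

def lms_filter(reference, desired, num_taps=8, step=0.05):
    """Adapt FIR weights so that the filtered reference tracks `desired`."""
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # most recent reference samples, newest first
    errors = []
    for x, d in zip(reference, desired):
        history = [x] + history[:-1]
        y = sum(w * h for w, h in zip(weights, history))  # FIR output
        e = d - y                                         # residual error
        # LMS update: move each weight along the error gradient
        weights = [w + step * e * h for w, h in zip(weights, history)]
        errors.append(e)
    return weights, errors

# Toy stationary noise: one sine wave. The target is a scaled copy delayed by 2 samples.
n = 2000
reference = [math.sin(2 * math.pi * 0.05 * i) for i in range(n)]
desired = [0.0, 0.0] + [0.5 * r for r in reference[:-2]]

weights, errors = lms_filter(reference, desired)
early = sum(e * e for e in errors[:100]) / 100   # mean squared error while adapting
late = sum(e * e for e in errors[-100:]) / 100   # mean squared error after convergence
```

The filter has no memory of earlier noise patterns: if the reference switched to a different tone and back again, the weights would have to be re-learnt each time.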

In ANC, the NN is used to predict the values of the noise samples at the moment when the black noise (the counter-signal played through the headphones) meets the noise at the position of the microphone.

I expect that external noises like a loud exhaust system or a dishwasher aren’t stationary in the wide sense, but they are pretty repetitive and could be learnt by a NN relatively easily.
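One way to phrase this prediction task as training data is sketched below. It is a hypothetical illustration, not my training code: the function name `make_training_pairs` and the `window` and `lead` parameters are assumptions, where `lead` stands for how many samples ahead the network has to predict.

```python
def make_training_pairs(noise, window, lead):
    """Pair each window of past samples with the sample `lead` steps after it."""
    pairs = []
    for end in range(window, len(noise) - lead + 1):
        inputs = noise[end - window:end]   # the last `window` samples seen so far
        target = noise[end + lead - 1]     # the sample the network should predict
        pairs.append((inputs, target))
    return pairs

# Tiny example with an easily checked sequence 0, 1, ..., 9.
noise = list(range(10))
pairs = make_training_pairs(noise, window=4, lead=2)
first_inputs, first_target = pairs[0]
```

With `lead=1` the target is simply the next sample; a larger `lead` would account for extra delay between computing the output and the sound reaching the microphone.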

Test set-up

My pet project works like classic noise-cancelling headphones, except that the microphone is located inside an ear cup. The test set-up is plain and simple:

I didn’t put the microphone on the outside (as noise-cancelling headphones usually do) because I wanted to solve the ANC problem in a more general case. This set-up lets me easily tune my code to a room or car environment, because the structure looks the same in all of those cases.

Software layout

The software consists of three processes:

A C++ program for real-time signal processing,

The Learner, a Python script with TensorFlow that teaches the NN,

And the Manager, another Python script that manages the whole process.

The Manager script communicates with the others via gRPC.

DIY mic-preamp and my old and dirty audio-technica ATH-910PRO

The microphone preamp and headphones are connected to a sound card, which is attached via USB to a laptop with Linux Mint 17 on it.

The microphone measures the combination of the noise and the black noise. The C++ program then does the following:

reads samples,

splits the external noise from the counter-signal coming from the headphones,

passes the noise samples to the feed-forward pass of the neural network to get black noise samples,

plays the black noise samples produced by the NN back through the headphones,

transfers noise samples to the learner script via the manager script,

gets new NN coefficients from the manager.

I play the noises from my mobile phone, placed a couple of meters away from the headphones. This construction managed to suppress different noise patterns, even non-stationary signals like bunches of sine waves turned on and off arbitrarily by hand.

Internal details

The input sound is sampled at 48 kHz with 16-bit samples. The code downsamples it by a factor of 8 (to 6 kHz) and forwards it to the splitter.
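The rate arithmetic can be sketched as follows. Block averaging is used here only as the simplest possible low-pass step and is an assumption: a production decimator would normally use a properly designed anti-aliasing filter, and this is not the filter from my C++ code.

```python
def decimate(samples, factor=8):
    """Average each block of `factor` samples into one output sample."""
    usable = len(samples) - len(samples) % factor  # drop an incomplete last block
    return [sum(samples[i:i + factor]) / factor for i in range(0, usable, factor)]

INPUT_RATE_HZ = 48_000
FACTOR = 8
output_rate_hz = INPUT_RATE_HZ // FACTOR  # 48 kHz / 8 = 6 kHz

# One millisecond of input (48 samples) becomes 6 output samples.
one_ms = [float(i) for i in range(48)]
reduced = decimate(one_ms, FACTOR)
```

At 6 kHz the system can only represent content below 3 kHz, which also limits the band in which cancellation is possible.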

The splitter is needed to isolate the external noise from the input samples, as if no signal were being played back through the headphones.
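A simplified way to think about the splitter is sketched below. It assumes that the path from the headphone driver to the microphone can be modelled as a short, already known FIR filter (the hypothetical `path` coefficients); the actual splitter may work differently.

```python
def split_external_noise(mic, played, path):
    """Subtract the modelled headphone contribution from the microphone signal."""
    external = []
    for n, m in enumerate(mic):
        # What the microphone should hear of the signal we played ourselves.
        echo = sum(path[k] * played[n - k] for k in range(len(path)) if n - k >= 0)
        external.append(m - echo)
    return external

# Simulated check: build a microphone signal from known noise plus the played signal.
path = [0.0, 0.6, 0.2]                  # assumed headphone-to-mic response
noise = [1.0, -1.0, 0.5, 0.0, 0.25]     # external noise we want to recover
played = [0.5, 0.5, -1.0, 0.0, 1.0]     # samples sent to the headphones
mic = [
    noise[n] + sum(path[k] * played[n - k] for k in range(len(path)) if n - k >= 0)
    for n in range(len(noise))
]
recovered = split_external_noise(mic, played, path)
```

In this idealised case `recovered` matches `noise` up to rounding; in practice the quality of the split depends on how well the path model matches the real acoustics.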

I implemented the perceptron myself in C++ with SIMD intrinsics in order to meet the real-time processing requirement. The perceptron consists of two hidden layers and a linear output layer:

The input layer, tanh[304 x 64], takes the last 304 samples of the previously isolated external noise.

An intermediate layer, tanh[64 x 64].

The output layer is linear, [64 x 1].
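The forward pass of a network with this shape can be sketched in plain Python. This is an illustration only: the real code is C++ with SIMD, and the bias terms, the random initialisation and the 880 Hz test window are assumptions made for the example.

```python
import math
import random

LAYER_SIZES = [304, 64, 64, 1]  # input window, two tanh layers, one linear output

def init_layer(n_in, n_out, rng):
    """Random weights and zero biases (biases are an assumption of this sketch)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases

def dense(inputs, layer, activation=None):
    """One fully connected layer: activation(W x + b), or W x + b if linear."""
    weights, biases = layer
    outputs = []
    for row, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(row, inputs)) + b
        outputs.append(activation(z) if activation else z)
    return outputs

def predict(window, layers):
    """Map the last 304 noise samples to one black noise sample."""
    hidden1 = dense(window, layers[0], math.tanh)
    hidden2 = dense(hidden1, layers[1], math.tanh)
    return dense(hidden2, layers[2])[0]

rng = random.Random(0)
layers = [init_layer(a, b, rng) for a, b in zip(LAYER_SIZES, LAYER_SIZES[1:])]

# An 880 Hz sine sampled at 6 kHz, 304 samples long.
window = [math.sin(2 * math.pi * 880 * n / 6000) for n in range(304)]
black_noise_sample = predict(window, layers)
```

With untrained random weights the output is meaningless; the point is only the data flow of one prediction per downsampled input sample.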

It is a relatively small network, and it can handle a horizon of the last 304 samples (about 51 ms at 6 kHz). It has a simple implementation and takes 30–60 seconds to train. I’m going to upgrade it to some recurrent variant; I’ve already checked the ability of an RNN model to adapt to some complex sounds in a Jupyter notebook. It looks promising but takes far more time to converge. I hope I’ll write a post about this notebook later.

The output signal is upsampled by a factor of 8 and sent to the headphones.

At the same time, the program sends batches of external noise samples to the Python script, which constantly adapts the weights of the perceptron to newly measured noise patterns and sends them back to the C++ program. I run this Python script in the cloud on a VM with a GPU. BTW, I found Paperspace is the best for ML experiments!

Results

Here are some plots demonstrating the achieved results. This data is a log from the C++ program; there are no objective measurements for now. However, I tried wearing the headphones, and it seems to work just like the plots show.

Terms

Black noise is the signal that was sent to the headphones.

Pure noise is the external noise isolated by the splitter.

Black noise transformed to mic’s input is the residual part of the signal isolated by the splitter.

samples_in is the raw sound samples measured by the microphone (and downsampled by a factor of 8, of course).

The plot below shows how the perceptron senses the input and starts operating.

Single sine-wave @ 880 Hz