The Web Audio API has been evolving over the last couple of years and opens the web up to many possibilities with sound and music. It's still not perfect: different browsers behave in different ways, but we are getting there. I know it's not something everybody likes, and annoying, pointless audio can cause fits of rage in some users. We've all experienced that 'Which fucking tab is playing music?' moment, and it seems like a remnant of the Flash era. But how is a musician or sound designer meant to showcase their work in fun and interesting ways on the web?

In this article we are going to create an audio visualisation using the DOM and the Web Audio API. Because Firefox has issues with correctly handling CORS, and Safari seemingly reports no signal (a constant value of 128) across the board when requesting byte data from an AnalyserNode, this demo is Chrome-only.

Take a look at the demo.

The Audio component

First off we are going to create the audio component of our demo. We need to have a look at the possibilities of preloading or streaming files and processing the audio data as it plays.

Creating the Audio Context

The AudioContext is the main backbone of the Web Audio API, and an interface that handles the creation and processing of individual audio nodes. First things first, we will initialise an over-arching AudioContext for our build. We will later use this context to create both buffer source and script processor audio nodes which we will 'wire' together.

```javascript
var context;

try {
    context = new AudioContext();
} catch (e) {
    throw new Error('The Web Audio API is unavailable');
}
```

Preloading MP3 over XHR

Since the second iteration of XMLHttpRequest we have been able to do some rather funky things with fetching data from the server. In this instance we are going to request an .mp3 audio file as an ArrayBuffer , which makes it far easier to work with in the Web Audio API.

```javascript
var xhr = new XMLHttpRequest();

xhr.open('GET', '/path/to/audio.mp3', true);
xhr.responseType = 'arraybuffer';
xhr.onload = function () { /* ... */ };
xhr.send();
```

In the XHR's onload handler, the array buffer of the file will be available in the response property, not the usual responseText . Now that we have that array buffer we can continue and create a buffer source on the audio context. First we will need to use the audio context's decodeAudioData asynchronous method to convert our ArrayBuffer into an AudioBuffer .

```javascript
var sound;

xhr.onload = function () {
    sound = context.createBufferSource();

    context.decodeAudioData(xhr.response, function (buffer) {
        sound.buffer = buffer;
        sound.connect(context.destination);
    });
};
```

At this point we have an AudioBufferSourceNode whose buffer holds our decoded audio. By simply calling sound.start() at the end of the decodeAudioData callback, the sound should play.

This method of preloading files over XHR is all very well and good for small files, but perhaps we don't want the user to have to wait until the whole file has downloaded before we start playing. That leads us to a slightly different method, which lets us use the streaming capabilities of the HTMLMediaElement.

Streaming with the HTML Media Element

For streaming we can use an <audio> element instantiated from JavaScript. Using the createMediaElementSource method, we can connect our audio element directly into our context whilst still retaining the HTMLMediaElement API methods such as play() and pause() . Rather than waiting for our file to be fully available using the canplaythrough event, we listen to the canplay event to find out as soon as enough data is downloaded for the file to be played for at least a few frames.

```javascript
var sound,
    audio = new Audio();

audio.addEventListener('canplay', function () {
    sound = context.createMediaElementSource(audio);
    sound.connect(context.destination);
    audio.play();
});

audio.src = '/path/to/audio.mp3';
```

This approach needs a lot less code and is more suitable for our demo's implementation, so let's clean this whole thing up with some promises and a bit of a Class definition of a Sound.

```javascript
var audio, context;

try {
    context = new AudioContext();
} catch (e) {
    throw new Error('The Web Audio API is unavailable');
}

var Sound = {
    element: undefined,
    play: function () {
        var sound = context.createMediaElementSource(this.element);

        this.element.onended = function () {
            sound.disconnect();
            sound = null;
        };

        sound.connect(context.destination);
        this.element.play();
    }
};

function loadAudioElement(url) {
    return new Promise(function (resolve, reject) {
        var audio = new Audio();

        audio.addEventListener('canplay', function () {
            resolve(audio);
        });
        audio.addEventListener('error', reject);
        audio.src = url;
    });
}

loadAudioElement('/path/to/audio.mp3').then(function (elem) {
    audio = Object.create(Sound);
    audio.element = elem;
    audio.play();
}, function (elem) {
    throw elem.error;
});
```

Now we have our file playing, let's go ahead and start attempting to get the frequency data from our audio.

Implementing Audio Processing

To start listening to the live data from the audio context as our file plays, we need to wire up two separate audio nodes. These nodes can be defined from the start, as soon as we have created the audio context. The first we create is a ScriptProcessorNode, an interface that allows us to process the audio; the second is an AnalyserNode, which provides us with real-time frequency and waveform/time domain analysis information.

```javascript
var audio,
    context = new (window.AudioContext || window.webkitAudioContext)(),
    processor = context.createScriptProcessor(1024),
    analyser = context.createAnalyser();

processor.connect(context.destination);
analyser.connect(processor);

var data = new Uint8Array(analyser.frequencyBinCount);
```
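It's worth knowing where the length of that data array comes from: frequencyBinCount is always half the analyser's fftSize, which defaults to 2048 and gives us 1024 bins. As a rough sketch (assuming a 44.1 kHz context sample rate, which is typical but not guaranteed), each frequency bin covers a band of sampleRate / fftSize Hz:

```javascript
// The analyser exposes half as many frequency bins as its fftSize
// (the default fftSize is 2048, so frequencyBinCount is 1024).
var fftSize = 2048;
var frequencyBinCount = fftSize / 2; // 1024 — the length of our data array

// Hypothetical helper: the approximate centre frequency of bin i,
// assuming a 44.1 kHz sample rate.
function binToFrequency(bin, sampleRate, fftSize) {
    return bin * sampleRate / fftSize;
}

binToFrequency(0, 44100, fftSize);    // 0 Hz (DC)
binToFrequency(1023, 44100, fftSize); // ~22028 Hz, just below Nyquist
```

Lowering fftSize gives you fewer, broader bins, which can be handy if you want less data per frame.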

Now that we have defined our analyser node and a data array, we have to make a slight change to our Sound class definition. Instead of wiring the audio's media element source only into our audio context, we now wire it through our analyser as well. We should also add an audioprocess handler to the processor node at this point and remove it when the file ends.

```javascript
play: function () {
    var sound = context.createMediaElementSource(this.element);

    this.element.onended = function () {
        sound.disconnect();
        sound = null;
        processor.onaudioprocess = function () {};
    };

    sound.connect(analyser);
    sound.connect(context.destination);

    processor.onaudioprocess = function () {
        analyser.getByteTimeDomainData(data);
    };

    this.element.play();
}
```

This now means the audio nodes are wired up in the following way:

```
                        /=> AnalyserNode => ScriptProcessorNode \
MediaElementSourceNode =                                         => AudioContext
                        \_______________________________________/
```

If you were simply to add a console.log(data) to the end of the audioprocess handler you would see a number of large arrays of integers fill up the console pretty quickly. This is exactly the data we are interested in.

To get the Frequency data instead, we would simply change the line in the audioprocess handler to:

```javascript
analyser.getByteFrequencyData(data);
```
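Before doing anything visual, it can help to sanity-check what you're receiving. As a quick sketch (a hypothetical helper, not part of the demo), you can reduce each snapshot of byte data to a single average level and log that instead of the whole array:

```javascript
// Reduce one snapshot of byte data (values 0–255) to a single
// average level. Handy for logging inside the audioprocess handler:
//   analyser.getByteFrequencyData(data);
//   console.log(averageLevel(data));
function averageLevel(data) {
    var sum = 0;

    for (var i = 0; i < data.length; i++) {
        sum += data[i];
    }

    return data.length ? sum / data.length : 0;
}

averageLevel(new Uint8Array([0, 128, 255, 128])); // 127.75
```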

Frequency Data vs. Waveform/Time Domain Data

There are actually four different methods that the AnalyserNode gives us. Two of them relate to the frequency data and two to the waveform or time domain data. Each of these two sets of data can be copied from the analyser as either byte or float data, which means our typed array has to match the type we choose: for byte data, as you have seen, we provide a Uint8Array , but for float data we need to provide a Float32Array .
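The two scales describe the same signal. Byte time-domain data centres silence at 128, while the float methods (which fill a Float32Array ) give samples in roughly the -1 to 1 range. A small sketch of the mapping between them, assuming the byte convention above:

```javascript
// Byte time-domain samples sit in 0–255 with silence at 128;
// the float methods give samples around -1..1 with silence at 0.
// A sketch of the conversion from the byte scale to the float scale:
function byteToFloatSample(byte) {
    return (byte - 128) / 128;
}

byteToFloatSample(128); // 0 — silence
byteToFloatSample(0);   // -1 — full negative peak
byteToFloatSample(255); // ~0.99 — full positive peak
```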

I have found that the waveform/time domain data provides a much smoother output, whereas the frequency data provides a much more visual representation of the idiosyncrasies of the audio. It is worth experimenting with these outputs and seeing in what ways you can transform the data to suit your visual representation.

The visual component

Now we have all that audio guff out of the way, the fun and experimentation begins. There is so much potential for using that data to manipulate elements on the page, whether it be altering the paths of an SVG or affecting the birth of new particles on a canvas. However, for this demo, we are going to create the visualisation using DOM nodes and requestAnimationFrame . This means we have quite a lot of versatility with our output. For the benefit of performance we should only use CSS properties that the browser can animate on the compositor, such as transform and opacity .

The initial setup

First let's add an image to our document and set up some CSS. In the case of the Fourth of 5 logo, it is a transparent SVG and the circular background is created using a border-radius in CSS.

```html
<div class="logo-container">
    <img class="logo" src="/path/to/image.svg" />
</div>
```

```css
.logo-container,
.logo,
.container,
.clone {
    width: 300px;
    height: 300px;
    position: absolute;
    top: 0;
    bottom: 0;
    left: 0;
    right: 0;
    margin: auto;
}

.logo-container,
.clone {
    background: black;
    border-radius: 200px;
}

.mask {
    overflow: hidden;
    will-change: transform;
    position: absolute;
    transform: none;
    top: 0;
    left: 0;
}
```

Essentially, we are going to slice up the image into a number of columns or slices. We should define the number of slices we want as a constant and then look into cloning the element that number of times. Let's start setting up the JavaScript:

```javascript
var NUM_OF_SLICES = 300,
    STEP = Math.floor(data.length / NUM_OF_SLICES),
    NO_SIGNAL = 128;

var logo = document.querySelector('.logo-container');

var slices = [],
    rect = logo.getBoundingClientRect(),
    width = rect.width,
    height = rect.height,
    widthPerSlice = width / NUM_OF_SLICES;

var container = document.createElement('div');
container.className = 'container';
container.style.width = width + 'px';
container.style.height = height + 'px';
```
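A quick note on STEP : with the analyser's default 1024-entry data array and 300 slices, Math.floor(1024 / 300) gives 3, so slice i will sample data[i * STEP] . Flooring (rather than rounding) keeps the highest index we read safely inside the array, as this small sketch shows:

```javascript
// With a 1024-entry data array and 300 slices, STEP is
// Math.floor(1024 / 300) === 3, so slice i samples data[i * STEP].
function dataIndexForSlice(i, step) {
    return i * step;
}

var step = Math.floor(1024 / 300); // 3

// The last slice reads index 299 * 3 === 897, safely below 1024.
dataIndexForSlice(299, step); // 897
```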

Creating the 'slices'

For each 'slice' we want to create a mask of the original element with a width of widthPerSlice and offset it on the x-axis based on its index in the array.

You will notice that for the mask elements we use a 2-dimensional CSS matrix rather than the usual transform helper functions. I have found that elements, particularly vector images and DOM nodes, seem to blur as they scale, suggesting the browser caches them at a specific size and scales that cached version. To prevent this artefacting we manually define a matrix .

```javascript
for (var i = 0; i < NUM_OF_SLICES; i++) {
    var offset = i * widthPerSlice;

    var mask = document.createElement('div');
    mask.className = 'mask';
    mask.style.width = widthPerSlice + 'px';
    mask.style.transform = 'matrix(1,0,0,1,' + offset + ',0)';

    var clone = logo.cloneNode(true);
    clone.className = 'clone';
    clone.style.width = width + 'px';
    clone.style.transform = 'translate3d(' + -offset + 'px,0,0)';
    clone.style.height = mask.style.height = height + 'px';

    mask.appendChild(clone);
    container.appendChild(mask);

    slices.push({ offset: offset, elem: mask });
}

document.body.replaceChild(container, logo);
```

We should now see... nothing different. What we have done is replace our original element with 300 separate 'slices' that line up to recreate the original element. It's easier to see this under the hood: inspect the DOM tree and you'll find our 300 .mask elements. These are the elements we are going to affect with our data.

Defining our render function

Although our audioprocess handler is receiving data very quickly, we don't want to overwhelm the browser with too many composition changes, so we hold on to the data until the browser reports it is ready for another paint. That's where requestAnimationFrame comes in.

```javascript
function render() {
    requestAnimationFrame(render);

    for (var i = 0, n = 0; i < NUM_OF_SLICES; i++, n += STEP) {
        var slice = slices[i],
            elem = slice.elem,
            offset = slice.offset;

        var val = Math.abs(data[n]) / NO_SIGNAL;

        elem.style.transform = 'matrix(1,0,0,' + val + ',' + offset + ',0)';
        elem.style.opacity = val;
    }
}

render();
```
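The val calculation is worth unpacking. Because byte time-domain data reports silence as 128 (our NO_SIGNAL constant), dividing a sample by it maps silence to a vertical scale of exactly 1 (the slice at its natural height), while peaks push the scale towards 0 or roughly 2. A sketch of that mapping on its own:

```javascript
// With byte time-domain data, silence reads as 128, so dividing by
// NO_SIGNAL maps a silent sample to a scaleY of exactly 1; peaks
// push the scale towards 0 (negative peak) or ~2 (positive peak).
function scaleForSample(sample, noSignal) {
    return Math.abs(sample) / noSignal;
}

scaleForSample(128, 128); // 1 — silence, slice at rest
scaleForSample(0, 128);   // 0 — full negative peak, slice collapses
scaleForSample(255, 128); // ~1.99 — full positive peak
```

This is also why the demo sits still before the audio starts: every sample reads 128, so every slice scales to 1 and opacity is full.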

That's it! You should have a fancy jiggling DOM construct! It's now up to you to start experimenting with what you have.

In terms of performance, open dev tools and, under the Rendering options, turn on 'Show paint rectangles' and 'Show composited layer borders'. Because we have only used compositor-friendly CSS properties, you can see that after an initial flash of green from the paint rectangles, there should be no more.

At this point, we have enough code to start thinking about pulling individual components of this into separate module definitions using something like Browserify. However, I will leave that up to you.