Look At This Photograph

Getting Started with Image Processing in JavaScript

In this tutorial I’m going to cover displaying an image in the browser. Simple enough, just use an image tag.

<img src="https://source.unsplash.com/user/cizikas/HfVsqRXkUmc" alt="A Giraffe">

Tutorial over…

But wait. There is not much we can do with this image. We can’t apply custom filters to make Instagram jealous. Nor can we detect faces in order to draw fake mustaches on the picture. To do these things we need access to all the bits and bytes that make up an image. We’ll be using the HTML Canvas and a library called GPU.js to run our algorithms on the computer’s graphics processing unit, more commonly known as the GPU.

Why the GPU?

Images are composed of pixels, and often a lot of them. Many image processing algorithms will run some process on every pixel to produce a new value for the pixel. We can of course run the process on each pixel one after the other and this could work fine for small images, but as the size of the image grows the processing time will grow too. This can be especially bad if we need the processing to happen in real time, such as on a video stream.
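To make the one-after-the-other approach concrete, here is a minimal sketch in plain JavaScript. It assumes the image is stored as a flat RGBA byte array (the layout the canvas ImageData API uses), and invertPixels is a hypothetical helper name chosen for illustration, not part of any library.

```javascript
// Sequential per-pixel processing: visit every pixel, one after the other.
// Assumes a flat RGBA byte layout (4 bytes per pixel), as in canvas ImageData.
function invertPixels(data) {
  const out = new Uint8ClampedArray(data.length);
  for (let i = 0; i < data.length; i += 4) {
    out[i] = 255 - data[i];         // red
    out[i + 1] = 255 - data[i + 1]; // green
    out[i + 2] = 255 - data[i + 2]; // blue
    out[i + 3] = data[i + 3];       // alpha is left unchanged
  }
  return out;
}

// A two-pixel "image": one opaque red pixel, one half-transparent black pixel.
const inverted = invertPixels(
  Uint8ClampedArray.from([255, 0, 0, 255, 0, 0, 0, 128])
);
```

The loop body runs once per pixel, so a 1920x1080 frame needs roughly two million iterations for a single pass.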

Instead of running a process on one pixel at a time, we can run it on many pixels simultaneously, in other words, in parallel. That’s what the GPU was built to do. JavaScript, however, was not designed to work this way. Code on a page’s main thread can only do one thing at a time. On top of that, the GPU needs to be told what to do in a different language, GLSL.

This is where GPU.js comes in. Instead of making us learn another programming language, it allows us to write (almost) regular JavaScript functions and compiles them to GLSL. Let’s see a simple example.

We start off by importing and instantiating the GPU.js library.
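A minimal sketch of that setup and a first kernel follows. The createKernel call with an output option is GPU.js's documented API. The names createIndexKernel and cpuStandIn are hypothetical: the stand-in only mimics the two-dimensional thread grid so the sketch can run where GPU.js is not loaded.

```javascript
// Builds the kernel: each thread returns its own [y, x] position.
// `gpu` is expected to be a GPU.js instance (new GPU()).
function createIndexKernel(gpu) {
  return gpu.createKernel(
    function () {
      return [this.thread.y, this.thread.x];
    },
    { output: [5, 4] } // 5 threads wide (x), 4 threads tall (y)
  );
}

// Hypothetical CPU stand-in (an assumption for illustration, not GPU.js).
// It calls the kernel function once per thread, one after the other.
const cpuStandIn = {
  createKernel(fn, { output: [width, height] }) {
    return (...args) => {
      const rows = [];
      for (let y = 0; y < height; y++) {
        const row = [];
        for (let x = 0; x < width; x++) {
          row.push(fn.apply({ thread: { x, y } }, args));
        }
        rows.push(row);
      }
      return rows;
    };
  },
};

// Where GPU.js is available, swap the stand-in for a real instance:
//   import { GPU } from 'gpu.js';
//   const kernel = createIndexKernel(new GPU());
const kernel = createIndexKernel(cpuStandIn);
const result = kernel();
```

Passing the GPU instance in as an argument keeps the kernel definition identical whether it runs on the real library or on the stand-in.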

The createKernel method takes a function that will be compiled to run on the GPU and an options object. The output option determines how many parallel threads will run and the dimensions of the returned array. In this case we have a two-dimensional 5x4 kernel: five threads wide (x) and four threads tall (y). The this.thread property tells the function which thread it is executing on. We’ll use it later to access the pixels of an image. The output of this kernel looks as follows…

[[[0, 0], [0, 1], [0, 2], [0, 3], [0, 4]],

[[1, 0], [1, 1], [1, 2], [1, 3], [1, 4]],

[[2, 0], [2, 1], [2, 2], [2, 3], [2, 4]],

[[3, 0], [3, 1], [3, 2], [3, 3], [3, 4]]]

Note: It is important to use the function() {} syntax rather than arrow syntax because we need access to the kernel's this context.

Displaying the image

Now that we understand how to run functions on the GPU, we can start using it to process images. GPU.js makes this simple. An instance of HTMLImageElement can be passed directly to a kernel and GPU.js will know how to handle it, turning it into a two-dimensional array of pixels. A pixel contains four channels of data (red, green, blue, and alpha), with each channel represented as a number between 0 and 1 inclusive. A value of 0 means none of that color is present and a value of 1 is the maximum amount. The alpha channel controls the transparency of the pixel, with 0 being fully transparent and 1 fully opaque. Using x and y coordinates we can access the pixels of the image. Let’s load an image and try it out.

const image = await loadImage('https://source.unsplash.com/random')
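The loadImage function is not a browser built-in. One way it might be written is sketched below, as a hypothetical helper that wraps an image element's load and error events in a Promise. The createImage parameter is an assumption added only so the sketch can be exercised outside a browser.

```javascript
// Hypothetical helper: resolves with the image element once it has loaded.
// `createImage` is injectable only so the sketch can run without a DOM.
function loadImage(url, createImage = () => new Image()) {
  return new Promise((resolve, reject) => {
    const image = createImage();
    // Request the image with CORS so WebGL is allowed to read its pixels.
    // This assumes the server sends the matching CORS headers.
    image.crossOrigin = 'anonymous';
    image.onload = () => resolve(image);
    image.onerror = () => reject(new Error(`Could not load image: ${url}`));
    image.src = url; // assigning src starts the download
  });
}
```

Because the helper returns a Promise, the await in the line above needs to sit inside an async function or a module that supports top-level await.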

We’ll also want to create a canvas to display the image in.

const canvas = document.createElement('canvas')

canvas.width = image.width;

canvas.height = image.height;

canvas.style = 'max-width: 66vw; max-height: 66vh;'

document.body.appendChild(canvas)

Now create a new instance of GPU. We’ll give it the canvas we just created so it knows where to render.

The GPU needs to know it will be drawing to the canvas rather than just doing numerical computations, so we set the graphical option to true. We get the pixel value using the x and y coordinates of the current thread. Notice that the first index into the image is the y-axis and the second is the x-axis. Also, the coordinate y=0, x=0 is the bottom-left of the image and y=height-1, x=width-1 is the top-right. This may seem backwards compared with other drawing APIs you may be familiar with. The kernel’s color method takes the red, green, blue, and optionally alpha channels in the range 0 to 1 and sets the pixel at the (this.thread.x, this.thread.y) position. This kernel makes no changes to the pixel data; the values are just taken from the image and drawn to the canvas.
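The kernel described above might look like the following sketch. The graphical and output options and this.color are GPU.js's own API. The names createRenderKernel and createCpuStandIn are hypothetical, and the stand-in only records color calls so the sketch can run without a GPU or a canvas.

```javascript
// Builds a graphical kernel that copies each image pixel to the canvas.
// `gpu` is expected to be a GPU.js instance created with new GPU({ canvas }).
function createRenderKernel(gpu, width, height) {
  return gpu.createKernel(
    function (image) {
      // First index is y, second is x; (0, 0) is the bottom-left pixel.
      const pixel = image[this.thread.y][this.thread.x];
      this.color(pixel[0], pixel[1], pixel[2], pixel[3]);
    },
    { graphical: true, output: [width, height] }
  );
}

// Hypothetical CPU stand-in (an assumption for illustration, not GPU.js).
// It stores every color() call in the `drawn` array as drawn[y][x].
function createCpuStandIn(drawn) {
  return {
    createKernel(fn, { output: [width, height] }) {
      return (...args) => {
        for (let y = 0; y < height; y++) {
          drawn[y] = [];
          for (let x = 0; x < width; x++) {
            const context = {
              thread: { x, y },
              color: (r, g, b, a = 1) => {
                drawn[y][x] = [r, g, b, a];
              },
            };
            fn.apply(context, args);
          }
        }
      };
    },
  };
}

// A 2x1 "image": a red pixel and a half-transparent blue pixel.
const tinyImage = [[[1, 0, 0, 1], [0, 0, 1, 0.5]]];
const drawn = [];
const renderTiny = createRenderKernel(createCpuStandIn(drawn), 2, 1);
renderTiny(tinyImage);
```

In the browser, the same function would be called with the real library and the loaded image, for example createRenderKernel(new GPU({ canvas }), image.width, image.height).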

Now run the kernel and there should be an image in the canvas.

kernel(image)

Photo by Aidas Ciziunas on Unsplash

Our GPU has produced a giraffe.

A more interesting example

Of course we wouldn’t go through the trouble of processing an image on the GPU just to display it as is. Here is another example designed to show how the channels combine to produce the original color. The image below has seven sections. The bottom-left shows only the red channel, the top-left shows only the green channel, and the right column shows only the blue channel. The other sections show the overlap of the adjacent sections. For example, the left-middle section shows the combination of the red and green channels. In the middle we have all three channels, which combine to give us the original colors of the image.

And here is the kernel that generated the above image.
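One way to produce such a layout is sketched below. The split into thirds is an assumption that matches the description of the seven sections, not necessarily the exact boundaries used for the picture. The names createChannelKernel and createCpuStandIn are hypothetical, and the stand-in only records color calls so the sketch can run without a GPU.

```javascript
// Builds a graphical kernel that shows red, green and blue in overlapping
// regions. Assumed layout: each region spans two thirds of the image.
function createChannelKernel(gpu, width, height) {
  const kernel = gpu.createKernel(
    function (image, width, height) {
      const pixel = image[this.thread.y][this.thread.x];
      let red = 0;
      let green = 0;
      let blue = 0;
      // Red: left two thirds, bottom two thirds.
      if (this.thread.x < (2 * width) / 3 && this.thread.y < (2 * height) / 3) {
        red = pixel[0];
      }
      // Green: left two thirds, top two thirds.
      if (this.thread.x < (2 * width) / 3 && this.thread.y >= height / 3) {
        green = pixel[1];
      }
      // Blue: right two thirds, full height.
      if (this.thread.x >= width / 3) {
        blue = pixel[2];
      }
      this.color(red, green, blue, pixel[3]);
    },
    { graphical: true, output: [width, height] }
  );
  return (image) => kernel(image, width, height);
}

// Hypothetical CPU stand-in (an assumption for illustration, not GPU.js).
function createCpuStandIn(drawn) {
  return {
    createKernel(fn, { output: [width, height] }) {
      return (...args) => {
        for (let y = 0; y < height; y++) {
          drawn[y] = [];
          for (let x = 0; x < width; x++) {
            const color = (r, g, b, a = 1) => {
              drawn[y][x] = [r, g, b, a];
            };
            fn.apply({ thread: { x, y }, color }, args);
          }
        }
      };
    },
  };
}

// A 3x3 test image where every pixel has the same color.
const sample = [0.2, 0.4, 0.6, 1];
const grid = [0, 1, 2].map(() => [0, 1, 2].map(() => sample));
const sections = [];
createChannelKernel(createCpuStandIn(sections), 3, 3)(grid);
```

With the 3x3 test image each cell falls into exactly one section, so sections[0][0] holds only the red value, sections[1][1] holds all three channels, and the whole x=2 column holds only blue.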