Highfalutin’ Convolutin’

Convolution Image

I was doing a code test recently where I was asked to write an html 5 gallery. It led me to reach a couple of conclusions:-

  • HTML5 video is a lovely idea but from a practical implementation standpoint is stupid and useless because of the ridiculous browser requirements.
  • The canvas 2d context API is woefully underpowered.

Today I will be ranting about and playing with the second idea. We’ll start with introducing the CanvasPixelArray. We’ll apply a grey filter to an image, then we’ll move on to writing a simple convolution method.

I am also using this as a test for the syntax highlighting I’ve added to my site courtesy of the wonderful Syntax Highlighter project by Alex Gorbatchev.

If you don’t have a browser with canvas capabilities, then i’m afraid this isn’t going to go well for you. Get a proper browser. Alternatively try one which isn’t evil.

Enough talk. On to the good stuff.

The general flow is this; Create a canvas. Grab the 2D context. Put something on the canvas. Grab the pixel data from the canvas. Do something funky. Put the pixel data back.

Ok, so that sounds great. Getting stuff onto the canvas is easy, you can draw some random stuff yourself with the Drawing API or just copy image data into it. I’ll do the latter. We’ll give the canvas an ID and load an image. Once the image is loaded, grab the canvas context using that ID and getContext(‘2d’) and throw some pixels on there with drawImage().

This is our test image. It’s just a random holiday snap from a few years ago.

Convolution Sample Image

First step is to add a canvas tag with the width and height set. Don’t apply the width and height by editing the css or attributes with javascript after creating it, the specification calls for this to scale the canvas to the new size, not expand or contract it. It basically works the same way as changing the image width after it is loaded. This standardisation makes sense, but isn’t particularly practical in my opinion.

One very important thing to note about retrieving the image data from a canvas is that it returns an ImageData object with three properties; width, height and data. The data property is a CanvasPixelArray. This is about as basic an array as you can get. It has a length property and is numerically indexed. That’s it. No join(), slice() or any of the nice stuff we expect of an array nowadays.

The CanvasPixelArray is a series of 8-bit unsigned integers. Each pixel is split into 4 array elements. Each element represents the R, G, B or A value of a pixel in that order. Every 4 elements is the start of the next pixel. so pixel(0,0) of an image is given by arr[0] = Red, arr[1] = Green, arr[2] = Blue and arr[3] = Alpha. When looping through the array, it’s better to do an increment of 4 rather than a single increment so you know that the next 4 values will describe just a single pixel.

You can’t create an ImageData object on your own without having reference to a canvas context object using context.createImageData(). I’ve written a little Javascript library called BitmapFitlers to do some simple image manipulation. It is available here. To do the basic setup you call BitmapFilters.setNewCanvas(canvas) passing in a reference to a canvas object. This gives the library everything it needs to start performing non-destructive pixel manipulation.

Important side note: If you plan on iterating through the CanvasPixelArray, make sure you reference it with its own variable, not via the ImageData object’s data property. It is ridiculous how much slower the latter is.

  // myImageData is an ImageData object
  var myData = myImageData.data;
  var i,len = myData.length;
  for(i = 0; i < len; ++i)
  {
    // do this
    myData[i] = //something nice
    // not this
    myImageData.data[i] = //something horrible
  }

Now we have the image data, let’s do some image manipulation. Iterating through and changing individual pixels is easy, so below is a greyscale filter, let’s move on.

But that’s not interesting. All we did was loop through the pixels individually and edit them. What if we want to relate pixels to other pixels? For that we have convolution. However, if there is one thing that javascript doesn’t want to do, it’s convolution. Unfortunately Javascript does not supply any way to apply convolution matrices to the image data. To that end I wrote a really simple version for the BitmapFilters library. It only works on 3×3 matrices, and it’s not optimized so it’s pretty slow. I expect that unless I come to require a convolution filter for work, I won’t bother optimizing it. I’d rather test a different method of convolution such as using a Discrete Fourier Transform.

The idea is to supply a matrix (or since JS doesn’t do matrices, a 2 dimensional array) to be applied to the image one pixel at a time. Each pixel is calculated by superimposing the matrix onto the image, centered around the currently active pixel. The value of the current pixel and those around it are multiplied by their equivalent in the matrix and added to new value of the pixel. Sounds ridiculously complicated. Let’s take an example. This 3×3 matrix will do nothing to the image.

[ 0, 0, 0 ]
[ 0, 1, 0 ]
[ 0, 0, 0 ]

The center value of the matrix always applies to the currently active pixel, So in this case

new pixel(x,y) = 1*pixel(x, y)
+ 0*pixel(x-1, y)
+ 0*pixel(x+1, y)
+ 0*pixel(x, y-1)
+ 0*pixel(x, y+1)
+ 0*pixel(x-1, y-1)
+ 0*pixel(x-1, y+1)
+ 0*pixel(x+1, y-1)
+ 0*pixel(x+1, y+1)

If the matrix had looked like this:-

[ 0, 0, 0 ]
[ 1, 0, 0 ]
[ 0, 0, 0 ]

Then the calculation would change to this:-

new pixel(x,y) = 0*pixel(x, y)
+ 1*pixel(x-1, y)
+ 0*pixel(x+1, y)
+ 0*pixel(x, y-1)
+ 0*pixel(x, y+1)
+ 0*pixel(x-1, y-1)
+ 0*pixel(x-1, y+1)
+ 0*pixel(x+1, y-1)
+ 0*pixel(x+1, y+1)

which means that the current pixel value is replaced by the value of the pixel to its left, effectively shifting the whole image to the right by one pixel.

This is the function that does it, ugly as it is. I make no claims over its completeness or efficiency. The number of conditionals is appalling. I wish i had time to make it nice but I don’t.

BitmapFilters.applyConvolution = function(orig,filter,divisor){
		var i,j,offset,od,cd,len,flen,px,py,newImgData;
		//create a new ImageData object
		newImgData = context.createImageData(orig.width,orig.height);
		od = orig.data;
		cd = newImgData.data;
		len = cd.length;
		//because each pixel is split into 4 parts, the width is actually 4*context.width
		var rows = cWidth*4;
		var tmp,sum,m;
		//iterate through aaaaaaalll the pixel parts
		for(i = 0; i < len; i += 4)
		{
			for(m = 0; m < 3; ++m)//loop to do r g and b, but not a
			{
				sum = 0;
				if(i % rows !== 0) // if active pixel is left edge, don't look left.
				{
					tmp = filter[0][0]*od[i - rows - 4 + m];
					if(!isNaN(tmp)) sum += tmp;
					tmp = filter[1][0]*od[i - 4 + m];
					if(!isNaN(tmp)) sum += tmp;
					tmp = filter[2][0]*od[i + rows - 4 + m];
					if(!isNaN(tmp)) sum += tmp;
				}
				tmp = filter[0][1]*od[i - rows + m];
				if(!isNaN(tmp)) sum += tmp;
				tmp = filter[1][1]*od[i + m];
				if(!isNaN(tmp)) sum += tmp;
				tmp = filter[2][1]*od[i + rows + m];
				if(!isNaN(tmp)) sum += tmp;
				if((i+1) % rows !== 0) // if active pixel is right edge, don't look right.
				{
					tmp = filter[0][2]*od[i - rows + 4 + m];
					if(!isNaN(tmp)) sum += tmp;
					tmp = filter[1][2]*od[i + 4 + m];
					if(!isNaN(tmp)) sum += tmp;
					tmp = filter[2][2]*od[i + rows + 4 + m];
					if(!isNaN(tmp)) sum += tmp;
				}
				sum /= divisor;	
				cd[i+m] = sum;
			}
			cd[i+3] = 255;//set alpha to FF
		}
		return newImgData;

In Javascript the biggest problem when applying a matrix to image data is that there is no way to tell with the CanvasPixelArray where the left and right hand sides of the image are, or even when you are on the top row or the bottom row. This means that unless you do things properly, when you try to apply the matrix the right column calculations will include the pixels from the left column and vice-versa. If I had done a double loop (one for x, one for y) for the iteration through the pixels instead of a single loop, this would have just been a check for x < 0 and x > width, and the same for y. But I was trying to keep the number of loops down and this would mean doing another addition for every array access.

The example below takes the data from the grey canvas above and applies the following matrix to it. The matrix is a simple edge detection method.

[-1,-1,-1],
[-1, 8,-1],
[-1,-1,-1]

So basically each pixel is the outcome of subtracting the values of the 8 pixels around it from 8 times itself. Imagine what would happen if we picked a white block of 9 pixels and applied this matrix to the center pixel. You'd get 8*0xFFFFFF + 8*(-1 * 0xFFFFFF). This would be 0x000000, or black, indicating that there is no difference in these pixels. However if we imagine a white pixel surrounded by 8 blue pixels, which would be
8*0xFFFFFF + 8*(-1 * 0x0000FF) = 0x08FFFFFF - 0x0008FF = 0x08FFF700. Note that the top 8 bits in the example are not the alpha of the pixel, it's just hex maths. so the new colour value of the pixel is 0x08FFF700, which is greater than 0x00FFFFFF, therefore the pixel will be 0xFFFFFF, or white, indicating that the pixel is very different from its surroundings. It's magic I tells you.

And that's that. Next up I might do some DFT stuff for a nicer implementation of some convolution pixel magic.