hexagon.png

marcie1.png

> I = double(imread('marcie1.png'))/256; > I = (I(:,:,1)+I(:,:,2)+I(:,:,3))/3; > J = double(imread('hexagon.png'))/256; > h = size(I)(1); > w = size(I)(2);

J

circshift

> total = 0; > K = zeros(h, w); > for i = 1:h > i > for j = 1:w > if J(i, j) != 0 > K = K + J(i, j)*circshift(I, [i, j]); > total = total + J(i, j); > endif > endfor > endfor

total

> imshow(K/total) > pause(5)

[i, j]

J(i, j)

J

I

J

circshift(I,[0,1])

F

F(A+B) == F(A)+F(B)

F(s*A) == s*F(A)

s

A

B

F(circshift(I, [1,0])) == A .* F(I) F(circshift(I, [0,1])) == B .* F(I)

[i, j]

A.^i .* B.^j

.^

fft2

> r = rand(h, w); > A = fft2(circshift(r, [1, 0])) ./ fft2(r); > B = fft2(circshift(r, [0, 1])) ./ fft2(r);

> s = rand(h, w); > A0 = fft2(circshift(s, [1, 0])) ./ fft2(s); > B0 = fft2(circshift(s, [0, 1])) ./ fft2(s);

> abs(A-A0) > abs(B-B0)

> II = fft2(I);

> KK = zeros(h, w); > for i = 1:h > i > for j = 1:w > if J(i, j) != 0 > KK = KK + J(i, j).*A.^i.*B.^j.*II; > endif > endfor > endfor

fft2

ifft2

> K = ifft2(KK/total); > imshow(K) > pause(5)

> JJ = zeros(h, w); > for i = 1:h > i > for j = 1:w > if J(i, j) != 0 > JJ = JJ + J(i, j).*A.^i.*B.^j; > endif > endfor > endfor

> KK = JJ .* II; > K = ifft2(KK/total); > imshow(K) > pause(5)

fft2

ifft2

i,j

i,j

i-1,j

i,j

f

f

fft2

> II = fft2(I); > JJ = fft2(J); > KK = II .* JJ; > K = ifft2(KK/total); > imshow(K) > pause(5)

fft2

ifft2

fft2

fft2

fft2

There are many elementary introductions to the discrete Fourier transform on the web. But this one is going to be a little different. I hope to motivate and demonstrate an application of the Fourier transform starting from no knowledge of the subject at all. But apart from this sentence, I'll make no mention of complex numbers or trigonometric functions. That may seem weird - the standard definitions seem to make it clear that these concepts are crucial parts of the definition. But I claim that in some ways these definitions miss the point. An analogy from software engineering is appropriate: most definitions of the Fourier transform are a bit like explaining an interface to an API by reference to the implementation. That might tell you how it works at a nuts-and-bolts level, but that can obscure what the API is actually for.There's code all the way through so if anything I've said is ambiguous, that should help resolve it. I've chosen to write the following in 'literate Octave'. I can't say I love the language but it seems like the easiest way to express what I want here computationally.Suppose your camera has a hexagonal iris and it's out of focus. When you take a picture of a point light source (towards your lower left) you'll end up with a result like this:The effect is known as bokeh and the wikipedia page has a nice example with an octagonal iris.(If you want to use the Octave code in this article, save the image above as.)Suppose we take a picture of an ordinary scene instead. We can think of every visible point in the scene as a point light source and so the out-of-focus photograph will be the sum of many hexagons, one for each point in the image, with each hexagon's brightness determined by the brightness of the point it comes from. You can also flip this around and think about the out-of-focus image as being a sum of lots of copies of the original image, one of each point in the hexagon. We can go ahead and code this up directly in octave.Here's the picture that we'll apply bokeh to:(Background: I used to work for a division of Kodak and that was the test image we frequently used. Save the image as.)Let's read in our image and iris shape, converting the image to grayscale:Now we'll apply the bokeh by looping over the entire iris, accumulating shifted copies of the original image. We're optimising a little bit. As many of the pixels inare black we can skip over them. I'm usingso that we'll get wraparound at the edge. That turns out to be very convenient later.We useto scale the overall brightness back into a reasonable range:Make sure you understand what that code is doing because the rest of this article depends on it. The central line in the double loop is repeatedly adding copies of the original image, shifted by, and scaled byThere's just one problem with that code. It's incredibly slow. It's only tolerable because I know that most ofis zero so I could optimize it with a conditional. More general higher resolution images will leave you waiting for a long time.The image we have computed above is known as the convolution ofand. My goal is to show how we can use the Fourier transform to look at this in a very different way. As a side effect we will also get a much faster convolution algorithm - but the reason it runs faster is a story for another time. In this article just want to show what Fourier transforms are and why they're relevant at all.Central to the code above is the fact that we're shifting and adding the same image over and over again. If we could avoid that we might find a way to speed the code up. So let me define some shorthand. I'll use R to represent the function that shifts an image right one pixel. In Octave that's given by. I'll use U to mean a shift up by one pixel. Rather than define *the* Fourier transform I'm going to define a family of operations, all called Fourier transforms. (More precisely, these are two-dimensional discrete Fourier transforms.)(1) A Fourier transform is a linear operation that converts an image to another one of the same dimensions. This means that if you apply it to sums and multiples of images you get sums and multiples of the Fourier transforms. In the language of Octave, ifis such an operation, thenand, fora scalar.(2) A Fourier transform has this property: there is a pair of images,and, such that for any image I, F(U(I)) = AF(I) and F(R(I)) = BF(I). (AF(I) means the pixelwise product of A and F(I)). In Octave notation:Fourier transforms convert shifts to multiplications. (If you only learn one thing from this article, this should be it.)(3) Fourier transforms are invertible so there must be another linear transform Fsuch that F(F(I)) = I and F(F(I)) = I.Anything with these properties is an example of a Fourier transform. It's the second property that is crucially important. In jargon it's said to diagonalise translation.From (2) it follows that we can compute the Fourier transform of an image shifted byby multiplying the original Fourier transform by. (is the pixelwise power function.)If h is the image height, then shifting up h times should wrap us around to the beginning again. Similarly for w. So from (2) we know that A=1 and B=1. (That's shorthand for saying that each of the individual elements of A raised to the power of h give 1 and so on.)It just so happens that Octave comes supplied with a suitable function that satisfies the three conditions I listed. It's called. Let's find out what the corresponding images A and B are:(Note that ./ is the pixelwise division operator. If you're unlucky your random numbers will lead to a division by zero. Just try again.)Let's try again for another random image:We can see that A is almost exactly A0 and B is almost B0:So now we can go back to our original convolution algorithm. Let's defineAnd now we can use the first and third properties to compute the Fourier transform of the image with bokeh applied to it. We're applying properties (1) and (2) to the central line of the double loop above:I said that the definition of Fourier transform requires the existence of an inverse, and claimed thatwas a Fourier transform. So we must have an inverse. Octave conveniently provides us with the nameWe've eliminated all that shifting, but at the end of the day this code is slower. But did you notice something curious about the innermost line of the double loop? It's always adding the same multiple of II to KK. We can completely factor it out. So we can rewrite the code as:We can leave the multiplication by II all the way to the end:This is pretty cool. We can precompute JJ. Any time we want to convolve an image with the hexagon we applyto the image, multiply by JJ and then apply. But there's something even better going on. Let's look more closely at that double loop above involving powers of the elements of A and B. Let's write out the function it computes in standard mathematical notation:f(J) = ΣWhat is f(U(J))?f(U(J)) = ΣA(dy definition of shifting up)f(U(J)) = ΣA(by summing over i+1 instead of i and using wraparound)f(U(J)) = A f(J)Similarly,f(R(J)) = B f(J)In other words,satisfies the second property of Fourier transforms. It obviously satisfies the first property. Our transformlooks like a Fourier transform. In fact, it is, and Octave'sis defined this way. So now we have this procedure for convolving:We now have a fast convolution algorithm. Of course I've left out an important point:andare fast. They don't use the obvious summation algorithm suggested by my definition of f. But that's an implementation detail. We're able to reason successfully about important properties ofusing the properties I listed above.Let me recapitulate so you can see1. I defined Fourier transforms2. I showed how convolution could be rewritten using one3. I told you thatwas a Fourier transform, giving an alternative algorithm for convolution4. I showed that another part of the convolution algorithm looks like a Fourier transform. (I didn't prove it had an inverse.)5. I told you that (by amazing coincidence!) this other part is alsoI haven't shown you how to implement a Fourier transform. But I have shown you how you can reason about many of their properties. Enough to get much of the way to the convolution theorem Approaching Fourier transforms through the properties I listed is common in the more advanced mathematical literature. But for some reason, in the elementary literature people often choose to describe things in a more complicated way. This is true of many things in mathematics.I hope the above serves as useful motivation when you come to check out a more standard and more complete introduction to Fourier transforms. In particular, now's a good time to try to understand why the usual definition of the discrete Fourier transform satisfies the properties above and so fill in the steps I missed out.You may need to set up Octave for graphics. On MacOSX I started the X server and used "export GNUTERM=x11" before running Octave.