Let’s start with the interpolant. The original Perlin noise algorithm used a cubic Hermite spline of the form s(t) = 3t² − 2t³. This particular function is also sometimes known as smoothstep. It describes an s-shape, ramping smoothly up from 0 to 1 over the range of 0 to 1. It’s also symmetrical around the center of this square; that is, s(t) = 1 − s(1 − t).

So let’s flip it around, 1 − s(t), so that it’s 1 at 0 and falls off to 0 at 1. Then use the absolute value, 1 − s(|t|), so that it’s symmetric and falls off smoothly to 0 at -1 too. Call this new variation f(t):

f(t) = 1 − (3 − 2|t|)t²
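
In case code reads more easily than the formula, here is a minimal Python sketch of the interpolant and the falloff. The function names `smoothstep` and `falloff` are just my labels for s(t) and f(t); nothing here is specific to any particular noise library.

```python
def smoothstep(t):
    """Cubic Hermite interpolant s(t) = 3t^2 - 2t^3, meant for t in [0, 1]."""
    return t * t * (3.0 - 2.0 * t)


def falloff(t):
    """f(t) = 1 - (3 - 2|t|) t^2, meant for t in [-1, 1]."""
    a = abs(t)
    return 1.0 - (3.0 - 2.0 * a) * a * a


if __name__ == "__main__":
    # Sample the falloff across its support: 0 at the ends, 1 in the middle.
    for t in (-1.0, -0.5, 0.0, 0.5, 1.0):
        print(t, falloff(t))
```

Using |t| for both factors works because t² = |t|², so the falloff is just 1 − s(|t|) written out.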

Next we’ll extend that to two dimensions by applying it separately to x and y and then taking the product. (I focus exclusively on 2D noise in this post, but it’s trivial to extend all this to higher dimensions.) The figure below shows how this looks. In this and all the next figures, I’ve used white for values of 1 and above, black for -1 and below, and grey for 0:

[Figure: f(x) × f(y) = f(x) f(y)]

There are a couple of things to note here. First, it obviously evaluates to 1 at the origin. Second, it falls off fairly smoothly to exactly 0 at the boundaries of the -1 to 1 square. The central lobe isn’t circular by any means, but it’s a reasonable approximation here.
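
To put a rough number on "not circular, but close", here is a small sketch of the 2D kernel. The name `kernel` is my own; it is simply the product of the 1D falloff applied to x and to y.

```python
import math


def falloff(t):
    """f(t) = 1 - (3 - 2|t|) t^2, meant for t in [-1, 1]."""
    a = abs(t)
    return 1.0 - (3.0 - 2.0 * a) * a * a


def kernel(x, y):
    """Separable 2D falloff: f(x) * f(y)."""
    return falloff(x) * falloff(y)


if __name__ == "__main__":
    r = 0.5
    d = r / math.sqrt(2.0)
    # Two points at the same distance from the origin:
    print("along the axis:    ", kernel(r, 0.0))
    print("along the diagonal:", kernel(d, d))
```

At a distance of 0.5 from the origin the kernel is 0.5 along an axis and roughly 0.51 along the diagonal, so the central lobe is close to round even though the zero contour is the square itself.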

Next, let’s multiply that falloff kernel by a gradient. This is the gradient that will make this proper gradient noise and not just some other kind of noise function. The gradient here is computed by taking the dot product of an arbitrary vector, G, with each (x,y) coordinate as a vector offset from the origin. (If you’re not familiar with the vector dot product, you multiply the x-values from the two vectors, then the y-values, etc. and then sum.) The figure below shows the falloff, the gradient and their product. Try clicking or dragging to change vector G and see how it affects the result.

[Figure: f(x) f(y) × G ⋅ (x,y) = f(x) f(y) (G ⋅ (x,y))]

If you played around with that, you may have noticed some things. First, when looking at the gradient, there’s always a line for which it remains 0 (i.e., grey in these figures) and it always passes through the origin perpendicular to the vector G, regardless of the magnitude of the vector. Rotating G around changes the direction of the gradient, while changing the length controls the sharpness of the gradient (longer vectors make sharper gradients, shorter vectors make softer gradients). Nonetheless, the 2×2 square here is always divided into equal halves, one positive and one negative.

When you factor in the falloff, there’s still a line of zeroes through the origin, perpendicular to G and dividing the square into positive and negative halves. But now it meets the borders of the square and those are also zeroes. The equal-but-opposite minima and maxima lie within the interior of each half of the square. Effectively, it’s a dipole. Note that the direction of G now exclusively determines the shape of the two regions (and the positions of the minima and maxima). The length of G makes no difference to the shape; it only determines the intensity.

So given this, we can just focus on the direction of G and always use unit length vectors. If we clamp the product of the falloff kernel and the gradient to 0 at all points beyond the 2×2 square, this gives us the surflet mentioned in that cryptic sentence.
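
Here is a minimal sketch of a single surflet in Python, under the definitions above. The function name `surflet` and the argument names `gx`, `gy` (the components of G) are my own choices for illustration.

```python
import math


def falloff(t):
    """f(t) = 1 - (3 - 2|t|) t^2, meant for t in [-1, 1]."""
    a = abs(t)
    return 1.0 - (3.0 - 2.0 * a) * a * a


def surflet(x, y, gx, gy):
    """Falloff kernel times the gradient G . (x, y), clamped to 0 outside the 2x2 square."""
    if abs(x) >= 1.0 or abs(y) >= 1.0:
        return 0.0
    return falloff(x) * falloff(y) * (gx * x + gy * y)


if __name__ == "__main__":
    # A unit-length G pointing 30 degrees up from the x-axis.
    gx, gy = math.cos(math.radians(30.0)), math.sin(math.radians(30.0))
    print(surflet(0.4, 0.2, gx, gy))    # positive side of the dipole
    print(surflet(-0.4, -0.2, gx, gy))  # mirrored point, same size, opposite sign
    print(surflet(-gy, gx, gx, gy))     # on the line perpendicular to G, so 0 up to rounding
```

Doubling the length of G doubles every value without moving the zero line or the extrema, which is why unit vectors are enough.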

Now that we’ve got that basic element, we can center one at each point on the rectangular integer grid (such that they’ll overlap like shingles) and then simply sum them all up. If we do that, we have our Perlin noise! The figure below shows a random patch of noise with the integer grid and the gradient vectors for the surflets overlaid. You can click or drag on it to change the vectors and see how it affects the noise.
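
The summation can be sketched in a few lines of Python. Since each surflet only covers a 2×2 square, only the four surflets centered on the corners of the enclosing grid cell can be non-zero at any point. Note that `gradient` below is a made-up integer hash that I am using purely to get repeatable unit vectors; it is not the permutation table from Perlin's code, and any repeatable pseudo-random choice would do.

```python
import math


def falloff(t):
    """f(t) = 1 - (3 - 2|t|) t^2, meant for t in [-1, 1]."""
    a = abs(t)
    return 1.0 - (3.0 - 2.0 * a) * a * a


def surflet(x, y, gx, gy):
    """Falloff kernel times G . (x, y), clamped to 0 outside the 2x2 square."""
    if abs(x) >= 1.0 or abs(y) >= 1.0:
        return 0.0
    return falloff(x) * falloff(y) * (gx * x + gy * y)


def gradient(ix, iy, seed=0):
    """Hypothetical hash: map a grid point to a repeatable unit-length vector."""
    h = (ix * 374761393 + iy * 668265263 + seed * 2246822519) & 0xFFFFFFFF
    h = ((h ^ (h >> 13)) * 1274126177) & 0xFFFFFFFF
    h ^= h >> 16
    angle = 2.0 * math.pi * h / 2.0 ** 32
    return math.cos(angle), math.sin(angle)


def noise(x, y, seed=0):
    """Sum the four surflets whose 2x2 squares overlap the point (x, y)."""
    x0, y0 = math.floor(x), math.floor(y)
    total = 0.0
    for iy in (y0, y0 + 1):
        for ix in (x0, x0 + 1):
            gx, gy = gradient(ix, iy, seed)
            total += surflet(x - ix, y - iy, gx, gy)
    return total


if __name__ == "__main__":
    # Crude ASCII preview of a 4x4 patch of the grid.
    shades = " .:-=+*#%@"
    for row in range(16):
        line = ""
        for col in range(32):
            v = noise(col / 8.0, row / 4.0)
            line += shades[min(9, max(0, int((v + 1.0) * 5.0)))]
        print(line)
```

Because f(t − 1) equals s(t) on the range 0 to 1, this sum should work out to the same thing as the more familiar formulation that blends the four corner dot products with smoothstep weights.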

Incidentally, this is why I’m not so fond of the reduced table of vectors in Perlin’s Improving Noise paper. If one naively takes a 2D slice through the 3D noise, along z = 0 for example, each gradient is only in one of the 8 basic directions and it’s biased towards the cardinal directions. This can produce ugly runs in the noise that go at 45° angles:
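
A quick way to check the claim about the slice is to project the table onto the plane and count. This sketch assumes the table in question is the twelve cube-edge directions, i.e. every vector with two components of ±1 and one of 0; the list `GRADIENTS_3D` is my own transcription of that set, not code copied from the paper.

```python
from collections import Counter

# Assumed table: the 12 directions from the center of a cube to its edges.
GRADIENTS_3D = [
    (1, 1, 0), (-1, 1, 0), (1, -1, 0), (-1, -1, 0),
    (1, 0, 1), (-1, 0, 1), (1, 0, -1), (-1, 0, -1),
    (0, 1, 1), (0, -1, 1), (0, 1, -1), (0, -1, -1),
]


def slice_directions(gradients):
    """Count the 2D directions left when z = 0 removes the z term of the dot product."""
    return Counter((gx, gy) for gx, gy, _gz in gradients)


if __name__ == "__main__":
    for direction, count in sorted(slice_directions(GRADIENTS_3D).items()):
        kind = "cardinal" if 0 in direction else "diagonal"
        print(direction, kind, count)
```

Under that assumption, eight of the twelve entries collapse onto the four cardinal directions and only four land on the diagonals.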

Instead of dealing with the complexity of gradients and surflets like this, one alternative is to just place random values between -1 and 1 on a grid and then upsample it using bilinear or bicubic interpolation. Some tutorials mistakenly call that Perlin noise, though it really isn’t; instead it’s considered “value noise”, as opposed to the “gradient noise” category that Perlin’s belongs to.
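
For comparison, here is a minimal sketch of value noise using bilinear interpolation. As with any such sketch, the hash in `lattice_value` is a made-up stand-in for whatever source of repeatable random values you prefer.

```python
import math


def lattice_value(ix, iy, seed=0):
    """Hypothetical hash: map a grid point to a repeatable value in [-1, 1)."""
    h = (ix * 374761393 + iy * 668265263 + seed * 2246822519) & 0xFFFFFFFF
    h = ((h ^ (h >> 13)) * 1274126177) & 0xFFFFFFFF
    h ^= h >> 16
    return 2.0 * h / 2.0 ** 32 - 1.0


def value_noise(x, y, seed=0):
    """Bilinearly interpolate the random values at the four surrounding grid points."""
    x0, y0 = math.floor(x), math.floor(y)
    u, v = x - x0, y - y0
    v00 = lattice_value(x0, y0, seed)
    v10 = lattice_value(x0 + 1, y0, seed)
    v01 = lattice_value(x0, y0 + 1, seed)
    v11 = lattice_value(x0 + 1, y0 + 1, seed)
    low = v00 + u * (v10 - v00)    # blend along x on the lower edge
    high = v01 + u * (v11 - v01)   # blend along x on the upper edge
    return low + v * (high - low)  # blend along y


if __name__ == "__main__":
    print(value_noise(2.0, 3.0), lattice_value(2, 3))  # identical at a grid point
    print(value_noise(2.5, 3.5))                       # average of the four corners
```

Notice that nothing forces the result through zero anywhere: the function simply passes through whatever values happen to sit on the grid.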

So why bother with gradient noise? Both value noise and gradient noise vary smoothly. The upper limit for how quickly they can change really depends on the density of the grid. As a result, they both have similar upper limits on the main frequencies they contain.

But value noise may give you runs where the values at the grid points are similar and the noise hardly changes (giving a “splotchy” appearance). In fact, a really unlucky case where all the values at the grid points are the same would contain only the zero frequency. Also, if you integrate a patch of value noise over some area (e.g., for downsampling it for mipmaps), it will approach 0 statistically but it’s highly unlikely to be exactly 0.

By contrast, for any given surflet, the line perpendicular to the gradient vector slices the surflet into equal but opposite halves that are next to each other. No matter how the gradients lie, there will always be at least some wiggle to a gradient noise function, and again it will be dependent on the grid density. As a result, gradient noise has a lower limit on the main frequencies it contains; it is approximately bandlimited whereas value noise is not. Also, since the halves of each surflet are equal and opposite, integrating over the whole surflet yields exactly 0. A periodic patch of gradient noise therefore also integrates to exactly 0.
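
The zero integral is easy to check numerically. The sketch below uses a plain midpoint rule over the 2×2 square; `integrate` is a throwaway helper of my own and the choice of G is arbitrary. For contrast it also integrates the bare falloff kernel, which should come out close to 1 rather than 0.

```python
import math


def falloff(t):
    """f(t) = 1 - (3 - 2|t|) t^2, meant for t in [-1, 1]."""
    a = abs(t)
    return 1.0 - (3.0 - 2.0 * a) * a * a


def surflet(x, y, gx, gy):
    """Falloff kernel times G . (x, y), clamped to 0 outside the 2x2 square."""
    if abs(x) >= 1.0 or abs(y) >= 1.0:
        return 0.0
    return falloff(x) * falloff(y) * (gx * x + gy * y)


def integrate(fn, n=200):
    """Midpoint-rule integral of fn(x, y) over the square [-1, 1] x [-1, 1]."""
    h = 2.0 / n
    total = 0.0
    for j in range(n):
        y = -1.0 + (j + 0.5) * h
        for i in range(n):
            x = -1.0 + (i + 0.5) * h
            total += fn(x, y)
    return total * h * h


if __name__ == "__main__":
    gx, gy = math.cos(0.7), math.sin(0.7)  # arbitrary unit-length G
    print("surflet:", integrate(lambda x, y: surflet(x, y, gx, gy)))
    print("kernel: ", integrate(lambda x, y: falloff(x) * falloff(y)))
```

The surflet result is zero up to floating-point rounding for any G, because every sample at (x, y) is cancelled by the one at (−x, −y).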