Resizing (enlarging)

In part 3, we discussed interpolation methods, including nearest neighbor, triangle, bilinear, and bicubic interpolation.

interpolation - the insertion of an intermediate value or term into a series by estimating or calculating it from surrounding known values.

One major reason to care about interpolation methods is that they're needed when resizing an image. We resize images every day, when we two-finger zoom on our phones or resize browser windows, so resizing needs to be fast while still producing acceptable quality.

Let’s begin with image enlarging—take a look at this picturesque image of a beautiful sunset! Unfortunately, the camera was quite small, so we have a 4x4 image and want to increase the resolution.

We have 16 pixels that each have a width and height. The value of each pixel is a point sample taken at the center of the pixel. Essentially, the pixel value is the sensor reading at the black crosses below (much like in our human vision).

Our goal is to increase the resolution of this 4x4 image, so let's aim for a 7x7 image. To do this, we must first map between our 4x4 and 7x7 coordinate systems.

Mapping Coordinate Systems

In our 4x4 image, (0, 0) is the center of the top left pixel. So the top left of the image is really (-0.5, -0.5), and the bottom right is really (3.5, 3.5). It's easiest to see why this is the case with a diagram:

In our desired 7x7 image, the same reasoning applies, and the top left of the image is still (-0.5, -0.5). The bottom right, however, is now (6.5, 6.5).

This mapping is linear and therefore must take the form Y = mX + c, where X is a coordinate in the new 7x7 image and Y is the corresponding coordinate in the old 4x4 image.

We can find this mapping by solving the system of linear equations that we form from the information we know.

We know that (-0.5, -0.5) in the 7x7 image corresponds to (-0.5, -0.5) in the 4x4 image, and similarly (6.5, 6.5) corresponds to (3.5, 3.5). Taking one axis at a time, we therefore get:

(A) -0.5m + c = -0.5

(B) 6.5m + c = 3.5

Rearranging equation (A) we get:

(A) c = 0.5m - 0.5

Subbing into (B) we get:

6.5m + 0.5m - 0.5 = 3.5

Which simplifies to find that:

m = 4/7

c = -3/14 (we get this by simply subbing m into either (A) or (B) above)

Finally, we find our linear mapping by subbing our values into Y=mX+c

Y = (4/7)X - (3/14)

We can now iterate over each point in the desired 7x7 image to map back to the old coordinates in the 4x4 image.

Example: (1, 3) in the new 7x7 image maps back to (5/14, 1.5) in the old 4x4 image.
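The mapping we just derived is easy to check in code. Here's a minimal sketch using Python's exact fractions so the results match the hand calculation (`map_new_to_old` is just an illustrative name):

```python
from fractions import Fraction

def map_new_to_old(x):
    """Map one axis of a 7x7-image coordinate back to the 4x4 image."""
    m = Fraction(4, 7)
    c = Fraction(-3, 14)
    return m * x + c

# (1, 3) in the new image maps back to (5/14, 3/2) in the old image.
print(map_new_to_old(1), map_new_to_old(3))  # 5/14 3/2
```

Applying the same function to the image corners confirms the two facts we started from: -0.5 maps to -0.5, and 6.5 maps to 3.5.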

Interpolation

We have now mapped our new 7x7 coordinates to the old 4x4 image, but we aren't done. The mapped coordinates are real numbers that generally don't land on pixel centers, and therefore don't have values (remember, the values are the sensor readings at the center of each pixel).

This is where we need interpolation. We interpolate the old pixel values to find the new pixel values at every 7x7 coordinate.

We'll use bilinear interpolation (detailed in part 3) for this example, and will find q1 and q2 before finding the final interpolated value (q) between them.

The value we're trying to interpolate lies vertically halfway between the old sample points, so q1 and q2 sit exactly midway between the upper and lower old values. We can therefore calculate q1 as follows:

q1 = ( 0.5*0 + 0.5*241, 0.5*255 + 0.5*90, 0.5*255 + 0.5*36 )

q1 = (120.5, 172.5, 145.5)

Similarly:

q2 = (248, 172.5, 18)

This was simple, as we were always multiplying by 0.5. But this isn’t the case when finding q from q1 and q2.

q = (9/14)*q1 + (5/14)*q2

Note: we're multiplying q1 by 9/14, the fraction of the square's width measured from the opposite side! If you imagine the new point sitting right next to the upper-left blue pixel, we'd want the new value to be mostly blue, and weighting each side by its distance from the opposite edge achieves exactly that.

So using q1 and q2, we can calculate q:

q = (166, 172.5, 100)
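The worked example above can be reproduced in a few lines (a sketch; `blend` is just an illustrative helper, and the two left-hand pixel values are read off the figure):

```python
def blend(a, b, wa):
    """Per-channel weighted blend of two RGB triples: wa*a + (1-wa)*b."""
    return tuple(wa * x + (1 - wa) * y for x, y in zip(a, b))

# Vertical step: q1 is the midpoint of the upper and lower left-hand pixels.
q1 = blend((0, 255, 255), (241, 90, 36), 0.5)  # (120.5, 172.5, 145.5)
# q2 is the corresponding midpoint of the right-hand pixels, as given above.
q2 = (248, 172.5, 18)
# Horizontal step: q1 carries the 9/14 weight, q2 the remaining 5/14.
q = blend(q1, q2, 9 / 14)  # approximately (166, 172.5, 100)
```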

We do this for all of the new 7x7 pixels to calculate all of the new values, which gives us our final result!

One problem is that the outer pixels just reuse the original image's outer pixels in their calculation, but you can use different padding methods to improve this.

Further Results

If we increase the resolution further, we get the following results:

You can see that there are some issues. Where there are hard edges, for example, they bleed. In our 7x7 image (bottom-left), you can see this bleeding clearly as the sun changes the color of the sky.

Bilinear interpolation famously creates star shapes as the resolution is increased. You can see this if you look at the yellow in the top-right image.

Bicubic interpolation is more complex and avoids this star pattern but has its own problems:

Changing the interpolation method doesn't change the overall process, so to enlarge an image we do the following:

1) Derive the linear mapping between the original and desired coordinate systems.

2) Map each new coordinate back to its (real-valued) original coordinates.

3) Use an interpolation method to find each of the new pixel values.
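The three steps above can be sketched as a single function for a grayscale image stored as a list of rows (a minimal illustration, not a production resizer; it clamps edge coordinates as its padding method):

```python
def enlarge_bilinear(img, new_w, new_h):
    """Enlarge a grayscale image (a list of rows) using the three steps above."""
    old_h, old_w = len(img), len(img[0])
    # Step 1: the linear map new -> old. Mapping the image corners
    # (-0.5 -> -0.5 and new-0.5 -> old-0.5) gives m = old/new, c = (m-1)/2,
    # which for 4x4 -> 7x7 reproduces m = 4/7 and c = -3/14.
    mx, cx = old_w / new_w, (old_w / new_w - 1) / 2
    my, cy = old_h / new_h, (old_h / new_h - 1) / 2
    out = []
    for j in range(new_h):
        row = []
        for i in range(new_w):
            # Step 2: the old (real-valued) coordinates for this new pixel.
            x, y = mx * i + cx, my * j + cy
            # Clamp so the outer pixels reuse the edge values (simple padding).
            x = min(max(x, 0.0), old_w - 1.0)
            y = min(max(y, 0.0), old_h - 1.0)
            x0, y0 = int(x), int(y)
            x1, y1 = min(x0 + 1, old_w - 1), min(y0 + 1, old_h - 1)
            fx, fy = x - x0, y - y0
            # Step 3: bilinear interpolation between the four neighbors.
            top = (1 - fx) * img[y0][x0] + fx * img[y0][x1]
            bottom = (1 - fx) * img[y1][x0] + fx * img[y1][x1]
            row.append((1 - fy) * top + fy * bottom)
        out.append(row)
    return out
```

One pixel channel at a time is slow in pure Python, of course; real implementations vectorize this, but the structure is the same.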

Resizing (shrinking)

Instead of increasing an image’s resolution, we may want to shrink a 448x448 image to make a 64x64 thumbnail, for example.

If we use the resizing method we just went through above (mapping coordinates followed by interpolation), the result will contain a lot of artifacts (look at the bike handlebar, or the black dot at the top-left in the trees, for example).

The top example here was created with nearest-neighbor interpolation and the bottom using bilinear interpolation.

These methods worked relatively well for enlarging, so why are they so bad for shrinking?

The Problem

In a nutshell, each of the larger new pixels gets colored using only the value at its center, even though it covers many pixels of the original, higher-resolution image.

When we map our few new coordinates back to the many original ones, each new pixel takes its value from a single small pixel near its center. Let's look at an example:

The top left purple pixel is completely filled with the color in the smaller central pixel (marked with a pink dot). This is the handlebar of the bike in the original image, and once resized, it looks like this:

This is clearly a problem, but let’s take a look at another example. Here’s a section of the trees:

The center old pixel is the only one used to fill the new pixel value, and the middle-left pixel therefore becomes black.

When shrinking, the central pixel value represents complex detail with a single color (you can see this big black pixel in the trees at the top left of the image).
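To make this concrete, here's a tiny numerical sketch (the values are made up for illustration, not taken from the actual image):

```python
# A bright 7x7 patch of trees containing one dark pixel exactly at its center.
patch = [[200] * 7 for _ in range(7)]
patch[3][3] = 0  # one dark detail pixel

# Map-and-sample shrinking reads only the value nearest the patch center,
# so that single dark pixel colors the entire new pixel black:
new_value = patch[3][3]
print(new_value)  # 0, even though 48 of the 49 original pixels are bright
```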

There must be a better way!

Spoiler: there is

The Solution

This 64x64 image is a much smoother result than before (same original 448x448 image):

We can do this by taking the average (mean) of all the old pixel values to decide what color the new pixel is. For example, the handlebar looks like this after applying this method:

How do we do this averaging?

Each new, larger pixel contains a 7x7 grid of original pixels (448/64 = 7). We therefore sum the values of each pixel in the 7x7 grid and divide by the number of pixels summed.

Continuing our example, shrinking from a 448x448 image to a 64x64 image would require you to use this formula to find the pixel value at each 64x64 pixel:

(sum of 7x7 patch pixel values)/49
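A minimal sketch of this averaging for a grayscale image stored as a list of rows (`shrink_average` is an illustrative name; it assumes the old size is an exact integer multiple of the new one, as 448 is of 64):

```python
def shrink_average(img, factor):
    """Shrink a grayscale image by an integer factor via patch averaging."""
    old_h, old_w = len(img), len(img[0])
    new_h, new_w = old_h // factor, old_w // factor
    out = []
    for j in range(new_h):
        row = []
        for i in range(new_w):
            # Sum the factor x factor patch and divide by its pixel count.
            total = sum(
                img[j * factor + dy][i * factor + dx]
                for dy in range(factor)
                for dx in range(factor)
            )
            row.append(total / factor ** 2)
        out.append(row)
    return out
```

For the 448x448 to 64x64 example, `factor` would be 7 and the divisor `factor ** 2` would be the 49 from the formula above.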

For shrinking, we should use this averaging technique rather than interpolative resizing.

Interpolative resizing doesn't usually work well at the extremes, so shrinking to less than half the original size or enlarging to more than double it is really not a good idea with interpolation alone. When shrinking heavily, use averaging to avoid the staircase artifacts; when enlarging heavily, the star patterns of bilinear interpolation become obvious, so prefer a higher-order method such as bicubic.

This averaging can be thought of as a "weighted sum of a patch of pixel values" with uniform weights, otherwise known as a convolution in image processing.
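To make that connection concrete, here's the same patch average written as an explicit weighted sum against a uniform kernel (a sketch; names are illustrative):

```python
def convolve_patch(patch, kernel):
    """Weighted sum of a patch of pixel values: one step of a convolution."""
    return sum(p * w for prow, wrow in zip(patch, kernel)
                     for p, w in zip(prow, wrow))

k = 2
uniform = [[1 / k**2] * k for _ in range(k)]  # every weight is 1/4
print(convolve_patch([[1, 3], [5, 7]], uniform))  # 4.0, the mean of the patch
```

Swapping in a non-uniform kernel gives other classic image-processing operations, which is why the convolution view is so useful.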