How Does a Computer Draw a Smooth Line?

Visualizing some of the intuition behind a statistical technique used to trace a line through data points

If I asked you to draw a smooth line through a bunch of points, you could probably do a pretty good job. It’s also something journalists do all the time, using computer graphics to illustrate trends in their data. But how do we get from that first example to the second? How does a computer replicate the intuition we exercise when tracing a line?

One such method is called kernel regression (or more specifically, Nadaraya-Watson kernel regression), which estimates a dependent variable y for an input x with the following equation:

ŷ(x) = Σᵢ K(x − xᵢ) yᵢ / Σᵢ K(x − xᵢ)

If you’re the type of person to abandon an article once a mathematical equation pops up, I urge you to stick around, since this is actually simpler than it appears, and we’ll be done with the math after this next paragraph.

All the above equation says is that we can estimate any new point by taking a weighted average of the existing points. The x − xᵢ term measures the distance between our new point and one of the old ones, and the K function (the kernel) assigns a weight based on that distance before multiplying it by the accompanying y-value. Add up all of those weighted points, and you get an estimate for the new point based on the values of all the old ones (the denominator ensures the weights sum to 1).
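The equation above translates into R almost line for line. This is a minimal sketch; the function name nw_estimate and the Gaussian kernel's standard deviation of 5 are my own illustrative choices, not anything prescribed by the method:

```r
# Nadaraya-Watson estimate at a new point x0: a kernel-weighted
# average of the observed y-values.
nw_estimate <- function(x0, x, y, K) {
  w <- K(x0 - x)       # kernel weight for each observed point, by distance
  sum(w * y) / sum(w)  # denominator normalizes the weights to sum to 1
}

# For example, a Gaussian kernel with standard deviation 5:
gauss <- function(d) dnorm(d, sd = 5)
```

Any function that assigns higher weight to smaller distances can play the role of K here; the Gaussian is just the one we will stick with below.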

This, of course, is a more explicit, technical version of what we do instinctively while drawing a line through data: take close points into account while mostly ignoring the rest, especially those farthest away.

If that still doesn’t make sense then…good! The goal of this post is to get a visual grasp on how a kernel works, not a mathematical one. Our final deliverable will be a GIF that clearly illustrates a kernel in action as it draws a smooth line. From here on we’ll use R to write some code, which you can follow if you’re familiar with the language or otherwise ignore.

First we need some fake data. We’ll count to 100 for our x-values, and in order to create a decently curvy response, our y-values will be the product of x squared and sine of x (scaled down to a single period). And, naturally, we’ll add some noise to shake the points off their underlying function.
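A sketch of that setup in R; the random seed and the noise level (a standard deviation of 500) are arbitrary choices of mine:

```r
set.seed(1234)  # arbitrary seed, for reproducibility
x <- 1:100
# x squared times sine scaled to one full period over 1..100, plus noise
y <- x^2 * sin(2 * pi * x / 100) + rnorm(100, sd = 500)
```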

Using R’s built-in ksmooth function (from the stats package), we can easily draw a smooth line for this data using the very kernel regression we just defined. The “normal” argument specifies which type of kernel we use, while bandwidth sets its sensitivity: the smaller the bandwidth, the more the smoothed line prioritizes extremely close points.
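The call might look like this (the data is regenerated here so the snippet stands alone, and the bandwidth of 20 is my own guess at a reasonable value for this data):

```r
set.seed(1234)
x <- 1:100
y <- x^2 * sin(2 * pi * x / 100) + rnorm(100, sd = 500)

# kernel = "normal" selects a Gaussian kernel; bandwidth controls how
# quickly a point's influence fades with distance
fit <- ksmooth(x, y, kernel = "normal", bandwidth = 20, n.points = 100)

plot(x, y, pch = 16, col = "grey60")
lines(fit, lwd = 2, col = "red")
```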

Now, ksmooth applies the earlier equation without exposing the intermediary steps (that’s the purpose of statistical software, after all), but we need the actual values of those weights for each point so that we can visualize their influence on the curve.

Since we’re using a Gaussian kernel, the weight of surrounding points is distributed normally, peaking at a distance of zero (that is, at the new point in question), which we can calculate using R’s dnorm function. At the risk of some classic mathematical hand-waving, feel free to ignore the “scale” line below; in short, it converts the ksmooth bandwidth, which is defined in terms of quartiles, into the standard deviation of the corresponding normal distribution.
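A sketch of that weight calculation. It relies on a fact documented in ksmooth’s help page: the kernels are scaled so that their quartiles sit at ±0.25 × bandwidth, which for a normal distribution means a standard deviation of 0.25 × bandwidth / qnorm(0.75):

```r
bandwidth <- 20                  # same value passed to ksmooth above
# convert ksmooth's quartile-based bandwidth into a normal sd:
# the quartiles of N(0, sd) sit at +/- qnorm(0.75) * sd
scale <- 0.25 * bandwidth / qnorm(0.75)

x  <- 1:100
x0 <- 50                         # the new point being estimated
w  <- dnorm(x - x0, sd = scale)  # raw Gaussian weight by distance
w  <- w / sum(w)                 # normalize so the weights sum to 1
```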

Now that we have a way of expressing the weight of every surrounding point that goes into the kernel’s estimate, we can visualize it using the size and color of points. More specifically, for any new point on the smooth line, the influence of the existing points is reflected by how large and red they are. For example, here is the point x = 50:

Essentially, if we track the smooth line to where it lands over the x-value of 50, we get an average of the surrounding points with the various weights illustrated by their size and color. The nearby points exert great influence on the smoother, and are thus big and bright red; farther points are all but negligible, and are therefore small and blue.
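One way to map the weights onto point size and color; the exact aesthetic mappings (the cex range and the blue-to-red gradient) are my own choices:

```r
set.seed(1234)
x <- 1:100
y <- x^2 * sin(2 * pi * x / 100) + rnorm(100, sd = 500)

x0 <- 50
scale <- 0.25 * 20 / qnorm(0.75)  # bandwidth of 20, converted as before
w <- dnorm(x - x0, sd = scale)
w <- w / max(w)                   # rescale to [0, 1] for the aesthetics

# heavier points are drawn bigger and redder; negligible ones small and blue
plot(x, y, pch = 16,
     cex = 0.5 + 2.5 * w,
     col = rgb(w, 0, 1 - w))
abline(v = x0, lty = 2)
```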

Let’s get fancy. By doing the same calculations and visualizations for every x-value between 1 and 100, we can generate a bunch of images and then stitch them together into a pretty little GIF. It’s as simple as executing the above code in a for-loop, and then uploading the individual images into some free online software to create the GIF:
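A sketch of that loop; each iteration recomputes the weights for one x-value and writes a PNG frame to a frames/ directory (the filenames, image dimensions, and plot settings are my own):

```r
set.seed(1234)
x <- 1:100
y <- x^2 * sin(2 * pi * x / 100) + rnorm(100, sd = 500)
fit <- ksmooth(x, y, kernel = "normal", bandwidth = 20, n.points = 100)
scale <- 0.25 * 20 / qnorm(0.75)

dir.create("frames", showWarnings = FALSE)
for (x0 in 1:100) {
  w <- dnorm(x - x0, sd = scale)
  w <- w / max(w)
  png(sprintf("frames/frame_%03d.png", x0), width = 600, height = 400)
  plot(x, y, pch = 16, cex = 0.5 + 2.5 * w, col = rgb(w, 0, 1 - w))
  lines(fit, lwd = 2)
  points(fit$x[x0], fit$y[x0], pch = 16, cex = 1.5)  # current smoothed point
  dev.off()
}
```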


This was a somewhat expedient examination of kernel regression. Surely we could unpack the equation more or play with different kernels and different bandwidths.

But our final GIF is nonetheless illustrative of the concept behind a smoothing kernel. Plus it kind of feels like we’re peeking into a computer’s brainwaves as it decides which points to consider as it draws a line. And in turn, that process feels very familiar to anyone who’s done something like this the old-school, manual way.

Thanks for reading. Most of my other statistical thoughts can be found on my blog, perplex.city. The full code is here.