Let’s dive into some math for a moment. I promise it will not be that hard. Recall that on a coordinate plane, a point is a pair of values, (x, y). An example of a point is (1, 3). These numbers are called the coordinates. This point exists in two dimensions, because it defines a unique spot in the two-dimensional coordinate plane. But what if we wanted to define a point in space, as opposed to a point on a flat surface? We could use a three-dimensional point, like (1, 3, 10). And if we wanted to define a point in an arbitrary number of dimensions (don’t try to picture what this would physically look like!), we just need a list of numbers as long as the number of dimensions. We call this list a vector.
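In code, this idea is about as simple as it sounds: a vector is just a list of numbers, and its dimension is the length of the list. A quick Python sketch (the 12-dimensional values are made up):

```python
# A point in 2-D, a point in 3-D, and a point in 12 dimensions:
# all of them are just lists of numbers, i.e. vectors.
p2 = [1, 3]
p3 = [1, 3, 10]
p12 = [4, 7, 1, 0, 2, 9, 3, 3, 8, 1, 5, 6]

# The "dimension" of a vector is simply how many numbers it holds.
for v in (p2, p3, p12):
    print(len(v), v)
```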

Now let’s step aside to a different topic. Suppose I have a bunch of news articles and I want to describe them in a succinct, consistent way for someone who doesn’t want to read them. I want the descriptions to make the articles easy to compare, so a prospective reader of my digest can pick out which ones to read in full. One approach is to pick a set of attributes and give each article a score from 1 to 10 for each attribute, representing how well that attribute describes the article. For example: suppose my attributes are “is about cats”, “is about animals”, “is about cars”, and “length”. In this system, a 200-word article about cats up for adoption might get ratings like (10, 9, 1, 1), since it is very much about cats and animals, but has nothing to do with cars and is very short. Likewise, a 5,000-word review of a new SUV that spends one paragraph on some dog-friendly features might score more like (1, 3, 10, 8). Notice how these lists of scores look exactly like the vectors described above.
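Here is what that scoring scheme might look like in Python, using the two example articles from the paragraph above (the titles are made up for illustration):

```python
# The four attributes, scored 1-10 for every article.
features = ["is about cats", "is about animals", "is about cars", "length"]

# Each article becomes a 4-dimensional vector of scores --
# a point in a 4-dimensional space, exactly like the vectors above.
articles = {
    "Cats up for adoption (200 words)": (10, 9, 1, 1),
    "SUV review (5,000 words)":         (1, 3, 10, 8),
}

for title, scores in articles.items():
    print(title, dict(zip(features, scores)))
```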

Now we can describe the magical “algorithm”.

Define a lot of attributes, now called features, to describe articles; call the number of features n. Read lots of articles and give each one a score for each feature. Store these vectors somewhere; together they make up a set called the test data. Wait for a person to read some articles (these are called the training data), then take the feature vectors of those articles and fit a curve to them using a type of mathematical formula called a regression. A commonly used type is logistic regression; there are other, somewhat more sophisticated formulas to use here, but they are rarely chosen with much care. Next, find the vector in the test data that is closest to the curve. I will skip the math, but you can do this by minimizing the Euclidean distance, a technique you can learn from a high-school calculus textbook, or a college-level multivariate calculus course for the hardest cases. Once you’ve found this vector, recommend the corresponding article to the person. If they read it, it becomes part of the training data, and the process repeats.
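The steps above can be sketched in Python. To keep it short, I’m standing in for the regression step with something much cruder: average the training vectors into a single point (a centroid) and recommend the test-data article nearest to it by Euclidean distance. All article titles and scores here are made up.

```python
import math

# Test data: feature vectors for unread articles, scored on
# (is about cats, is about animals, is about cars, length), each 1-10.
test_data = {
    "Cats up for adoption": (10, 9, 1, 1),
    "New SUV review":       (1, 3, 10, 8),
    "Kitten cafe opens":    (9, 8, 1, 2),
}

# Training data: vectors of articles the person has already read.
training_data = [(8, 9, 1, 2)]

def euclidean(a, b):
    """Straight-line distance between two n-dimensional vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(vectors):
    """Average of the training vectors -- a crude stand-in for the fitted curve."""
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))

def recommend(test_data, training_data):
    """Recommend the unread article whose vector is nearest the centroid."""
    target = centroid(training_data)
    return min(test_data, key=lambda title: euclidean(test_data[title], target))

print(recommend(test_data, training_data))
```

With this toy data the person has read one pet-related article, so the sketch recommends the unread article closest to that taste; if they read it, its vector would be appended to `training_data` and the loop would start over.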

This is the magic behind Facebook, Twitter, and that company I keep getting ads for that “made an algorithm for wine.” Ultimately, most of us likely have our echo chambers shaped by some variation of this mathematical process.