Recommender systems are at the core of pretty much every online service we interact with. Social networking sites like Facebook, Twitter and Instagram recommend posts you might like, or people you might know. Video streaming services like YouTube and Netflix recommend videos, movies or TV shows you might like. Online shopping sites like Amazon recommend products you might want to buy.

Collaborative filtering is perhaps the most common machine learning technique used by recommender systems.

Collaborative filtering is a method of making predictions about the interests of a user by collecting preferences from many users. The underlying assumption is that if a person A has the same opinion as a person B on an issue, A is more likely to have B’s opinion on a different issue than that of a randomly chosen person. — Wikipedia

The librec Java library provides over 70 different algorithms for collaborative filtering. In this post, however, we'll implement a relatively new technique called neural collaborative filtering.

The MovieLens 100K Dataset

The MovieLens 100K dataset is a collection of movie ratings by 943 users on 1682 movies. There are 100,000 ratings in total, far fewer than the 943 × 1682 possible user–movie pairs, since not every user has seen and rated every movie. Here are some sample ratings from the dataset:

Every user is given a unique numeric ID (ranging from 1 to 943), and each movie is given a unique numeric ID too (ranging from 1 to 1682). Users' ratings for movies are integers ranging from 1 to 5, with 5 being the highest.

Our objective here is to build a model that can predict how a user would rate a movie they haven’t already seen, by looking at the movie ratings of other users with similar tastes.

System Setup

If you want to follow along as we build this model, a fully reproducible Jupyter notebook for this tutorial can be found hosted on Jovian:

You can clone this notebook, install the required Python libraries using conda, and start Jupyter by running the following commands on your terminal or command prompt:

pip install jovian --upgrade                    # Install the jovian library
jovian clone a1b40b04f5174a18bd05b17e3dffb0f0   # Download notebook
cd movielens-fastai                             # Enter the created directory
jovian install                                  # Install the dependencies
conda activate movielens-fastai                 # Activate virtual environment
jupyter notebook                                # Start Jupyter

Make sure you have conda installed before running the above commands. You can also click on the “Run on Binder” button to start a Jupyter notebook server hosted on mybinder.org instantly.

Preparing the data

You can download the MovieLens 100K dataset from this link. Once downloaded, unzip and extract the data into a directory ml-100k next to the Jupyter notebook. As described in the README, the file u.data contains the list of ratings.

We begin by importing the required modules from Pandas and FastAI.

We can now read the data from the tab-separated file u.data into a Pandas data frame, and create a FastAI data bunch, which:

- Converts the Pandas data frame into tensors
- Splits the data into a training set and a validation set
- Creates data loaders to access the data in batches
- Checks if a GPU is available, and moves the data to the GPU
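As a minimal sketch of the loading step, here is how a few rows in the u.data format (tab-separated user id, item id, rating, timestamp, per the dataset README) can be read with pandas. The sample rows below are the first three lines of the actual file; the column names are our own choice, not mandated by the dataset:

```python
import io
import pandas as pd

# Three sample rows in the u.data format: tab-separated
# user id, item id, rating, timestamp (see the dataset README)
sample = io.StringIO(
    "196\t242\t3\t881250949\n"
    "186\t302\t3\t891717742\n"
    "22\t377\t1\t878887116\n"
)

# In the notebook we would read the real file instead:
# ratings = pd.read_csv('ml-100k/u.data', sep='\t',
#                       names=['userId', 'movieId', 'rating', 'timestamp'])
ratings = pd.read_csv(sample, sep='\t',
                      names=['userId', 'movieId', 'rating', 'timestamp'])
print(ratings[['userId', 'movieId', 'rating']])
```

From here, fastai (v1) can take the data frame and handle the tensor conversion, train/validation split, batching, and GPU placement in one step via its collaborative-filtering data bunch.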

Neural collaborative filtering model

The model itself is quite simple. We represent each user u and each movie m by a vector of a predefined length n. The rating for movie m by user u, as predicted by the model, is simply the dot product of the two vectors.

Here’s a small subset of the users and movies, represented by randomly chosen vectors of length 5, and the predicted ratings:

Source: FastAI Lesson 4

Since the vectors are chosen randomly, it’s quite unlikely that the ratings predicted by the model match the actual ratings. Our objective, while training the model, is to gradually adjust the elements inside the user & movie vectors so that predicted ratings get closer to the actual ratings.
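The idea above can be sketched in a few lines of NumPy. The vector length and the number of users and movies here are illustrative, not taken from the dataset:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5                      # length of each embedding vector
n_users, n_movies = 4, 3   # a tiny illustrative subset

# Randomly initialised user and movie vectors
user_vecs = rng.normal(size=(n_users, n))
movie_vecs = rng.normal(size=(n_movies, n))

# The predicted rating for every (user, movie) pair is a dot product
predictions = user_vecs @ movie_vecs.T
print(predictions.shape)  # one prediction per user/movie pair
```

Training then amounts to nudging the entries of user_vecs and movie_vecs so that the entries of predictions move towards the known ratings.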

We can use the collab_learner method from fastai to create a neural collaborative filtering model.

The actual model created here contains 2 important enhancements on the simpler version described earlier:

First, apart from the vectors for users and movies, it also adds bias terms to account for outliers, since some users tend to always rate movies very high or very low, and some movies tend to be universally acclaimed or disliked.

Second, it applies the sigmoid activation function to the above output, and scales it so that the result always lies in the given y_range, which is 0 to 5.5 in this case.
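With these two enhancements, a single prediction looks roughly like this. This is a minimal sketch of the idea, not fastai's exact implementation, and the vectors and biases below are made-up values:

```python
import numpy as np

def predict_rating(user_vec, movie_vec, user_bias, movie_bias,
                   y_min=0.0, y_max=5.5):
    """Dot product plus biases, squashed into [y_min, y_max] via a sigmoid."""
    raw = user_vec @ movie_vec + user_bias + movie_bias
    return y_min + (y_max - y_min) / (1 + np.exp(-raw))

u = np.array([0.2, -0.1, 0.4, 0.0, 0.3])
m = np.array([0.5, 0.1, -0.2, 0.3, 0.2])
r = predict_rating(u, m, user_bias=0.5, movie_bias=-0.2)
print(r)  # always lies between 0 and 5.5
```

Choosing 5.5 rather than 5 as the upper bound leaves the sigmoid some headroom, so the model doesn't need extreme activations to predict a perfect rating of 5.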

Training the model

The learner uses the mean squared error loss function to evaluate the predictions of the model, and the Adam optimizer to adjust the parameters (vectors and biases) using gradient descent. Before we train the model, we use the learning rate finder to select a good learning rate for the optimizer.

Upon inspection of the graph, we can see that the loss starts to decrease rapidly when the learning rate is around 0.01. We can choose this as our learning rate and train for 5 epochs, while annealing the learning rate using the 1-cycle policy, which leads to faster convergence.
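To make the 1-cycle policy concrete, here is a simplified sketch of the schedule: the learning rate warms up to its maximum over the first part of training and then anneals back down. fastai's actual schedule differs in details (it also cycles momentum and ends at a small non-zero rate), so treat the shape, not the exact formula, as the point:

```python
import math

def one_cycle_lr(step, total_steps, max_lr=1e-2, pct_start=0.3):
    """Simplified 1-cycle schedule: cosine warm-up to max_lr,
    then cosine annealing back towards zero."""
    warmup = int(total_steps * pct_start)
    if step < warmup:
        frac = step / warmup
        return max_lr * (1 - math.cos(math.pi * frac)) / 2  # 0 -> max_lr
    frac = (step - warmup) / (total_steps - warmup)
    return max_lr * (1 + math.cos(math.pi * frac)) / 2      # max_lr -> 0

lrs = [one_cycle_lr(s, 100) for s in range(100)]
print(max(lrs))  # peaks at max_lr, about 30% of the way through training
```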

In just 15 seconds (on a GTX 1080Ti GPU), the mean squared error has come down to around 0.82, which is quite close to the state of the art (as compared with these benchmarks). And it only took us 8 lines of code to load the data and train this model!

Looking at some predictions

While it’s great to see the loss go down, let’s look at some actual predictions of the model.

Indeed, the predictions are quite close to the actual ratings. We can now use this model to predict how users would rate movies they haven’t seen, and recommend movies that have a high predicted rating.
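Once the model is trained, recommendation boils down to ranking a user's unseen movies by predicted rating. The data below is randomly generated purely for illustration, and the variable names are our own, not from the notebook:

```python
import numpy as np

rng = np.random.default_rng(0)
n_movies = 10
predicted = rng.uniform(1, 5, size=n_movies)  # model's predicted ratings for one user
already_seen = {2, 5, 7}                      # movie indices the user has already rated

# Recommend the top 3 unseen movies by predicted rating
candidates = [i for i in range(n_movies) if i not in already_seen]
top3 = sorted(candidates, key=lambda i: predicted[i], reverse=True)[:3]
print(top3)
```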

Save and commit

As a final step, we can save and commit our work using the jovian library.

Jovian uploads the notebook to https://jvn.io, captures the Python environment and creates a shareable link for the notebook. You can use this link to share your work and let anyone reproduce it easily with the jovian clone command. Jovian also includes a powerful commenting interface, so you (and others) can discuss & comment on specific parts of your notebook.

Further Reading

In a future post, we’ll dive deeper and see how DataBunch and collab_learner are actually implemented, using PyTorch. We'll also explore how we can interpret the vectors and biases learned by the model, and see some interesting results.

In the meantime, here are some resources if you'd like to explore the topic further: