Introduction

The number of research publications on deep learning-based recommendation systems has grown rapidly in recent years. In particular, RecSys, the leading international conference on recommendation systems, has organized a regular workshop on deep learning since 2016. At the 2019 conference in Copenhagen a couple of weeks ago, for example, there was a whole category of papers on deep learning, which promotes research and encourages applications of such methods.

In this post and those to follow, I will walk through the creation and training of recommendation systems, a topic I am currently working on for my Master's thesis. Part 1 provided a high-level overview of recommendation systems, how they are built, and how they can be used to improve businesses across industries. Part 2 reviews ongoing research initiatives, focusing on the strengths, weaknesses, and application scenarios of these models.

Why Deep Learning for Recommendation?

Here are the four key strengths of deep learning-based recommendation systems compared to traditional content-based and collaborative filtering approaches:

Deep learning can model non-linear interactions in the data through non-linear activations such as ReLU, Sigmoid, and Tanh. This property makes it possible to capture complex and intricate user-item interaction patterns. Conventional methods such as matrix factorization and factorization machines are essentially linear models. This linear assumption, which underlies many traditional recommenders, is an oversimplification that greatly limits their expressiveness. It is well established that neural networks can approximate any continuous function to arbitrary precision by varying the choice and combination of activations. This property makes it possible to deal with complex interaction patterns and precisely reflect a user's preferences.
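To make this concrete, here is a toy numpy sketch (randomly initialized weights, hypothetical data) contrasting a linear matrix-factorization score with a small MLP score: the dot product is linear in the latent factors, while the ReLU hidden layer lets the score be a non-linear function of the same inputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical latent factors for one user and one item.
user = rng.normal(size=4)
item = rng.normal(size=4)

# Linear recommender (matrix factorization): score is a plain dot product.
mf_score = user @ item

# MLP recommender: concatenate the factors and pass them through a
# ReLU hidden layer, so the score is a non-linear function of the input.
W1 = rng.normal(size=(8, 8))
b1 = np.zeros(8)
w2 = rng.normal(size=8)

h = np.maximum(0.0, W1 @ np.concatenate([user, item]) + b1)  # ReLU
mlp_score = w2 @ h
```

In practice the weights `W1`, `b1`, and `w2` would be learned from interaction data; the point here is only the functional form.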

Deep learning can efficiently learn the underlying explanatory factors and useful representations from input data. In general, a large amount of descriptive information about items and users is available in real-world applications. Making use of this information deepens our understanding of items and users and thus yields a better recommender. As such, applying deep neural networks to representation learning in recommendation models is a natural choice. The advantages of using deep neural networks for representation learning are two-fold: (1) they reduce the effort spent on hand-crafted feature design; and (2) they enable recommendation models to incorporate heterogeneous content such as text, images, audio, and even video.
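As a minimal sketch of point (2), assume an item has both an id and a short text description. The fusion below (random weights, hypothetical data) learns a single dense item representation from both sources, rather than relying on hand-engineered features:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical inputs: an item id and a tiny bag-of-words vector
# for the item's text description (vocabulary of 6 words).
n_items, vocab, emb_dim = 100, 6, 4
item_id = 42
text_bow = np.array([1, 0, 2, 0, 0, 1], dtype=float)

# Learned lookup table for item ids (randomly initialized here).
item_embeddings = rng.normal(size=(n_items, emb_dim))

# Fuse the id embedding with the text features and project them into
# one dense item representation -- no hand-crafted feature engineering.
W = rng.normal(size=(emb_dim, emb_dim + vocab))
fused = np.concatenate([item_embeddings[item_id], text_bow])
item_repr = np.tanh(W @ fused)
```

The same pattern extends to image or audio features: encode each modality into a vector, concatenate, and project.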

Deep learning is powerful for sequential modeling tasks. In tasks such as machine translation, natural language understanding, and speech recognition, RNNs and CNNs play critical roles; they are widely applicable and flexible for mining sequential structure in data. Modeling sequential signals is an important topic for capturing the temporal dynamics of user behavior and item evolution; next-item/basket prediction and session-based recommendation are typical applications. As such, deep neural networks are a natural fit for this sequential pattern mining task.
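The session-based case can be sketched with a plain Elman RNN in numpy (random, untrained weights; hypothetical click data): the hidden state summarizes the clicks so far, and every item is then scored as the candidate next click.

```python
import numpy as np

rng = np.random.default_rng(2)
n_items, emb_dim, hid = 10, 4, 5

# Hypothetical session: item ids clicked in order.
session = [3, 7, 2]

# Randomly initialized parameters (in practice these are learned).
E = rng.normal(size=(n_items, emb_dim))   # item embeddings
W_x = rng.normal(size=(hid, emb_dim))     # input-to-hidden
W_h = rng.normal(size=(hid, hid))         # hidden-to-hidden
W_o = rng.normal(size=(n_items, hid))     # hidden-to-scores

# Plain Elman RNN: the hidden state accumulates the session history.
h = np.zeros(hid)
for item in session:
    h = np.tanh(W_x @ E[item] + W_h @ h)

# Score every item as the candidate next click.
scores = W_o @ h
next_item = int(np.argmax(scores))
```

Production session-based recommenders typically swap the Elman cell for a GRU or LSTM, but the recurrence over the click sequence is the same idea.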

Deep learning possesses high flexibility. There are many popular deep learning frameworks nowadays, including TensorFlow, Keras, Caffe, MXNet, DeepLearning4j, PyTorch, and Theano. These tools are developed in a modular way and have active community and professional support. This modularity makes development and engineering far more efficient: for example, it is easy to combine different neural structures into powerful hybrid models or to replace one module with another. Thus, we can easily build hybrid and composite recommendation models that simultaneously capture different characteristics and factors.
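A toy illustration of this modularity, in the spirit of hybrid models such as NeuMF (this is my own numpy simplification, not any framework's API): each branch is an interchangeable function, and building a hybrid model is just concatenating branch outputs before the final scoring layer.

```python
import numpy as np

rng = np.random.default_rng(3)
d = 4
user, item = rng.normal(size=d), rng.normal(size=d)

def gmf_branch(u, v):
    """Generalized matrix factorization: element-wise product."""
    return u * v

def mlp_branch(u, v, W, b):
    """Non-linear branch: ReLU layer over the concatenated factors."""
    return np.maximum(0.0, W @ np.concatenate([u, v]) + b)

# Swapping a branch, or adding a new one, only changes this fusion step.
W, b = rng.normal(size=(d, 2 * d)), np.zeros(d)
w_out = rng.normal(size=2 * d)
fused = np.concatenate([gmf_branch(user, item),
                        mlp_branch(user, item, W, b)])
score = w_out @ fused
```

Replacing `mlp_branch` with, say, a sequence encoder would leave the rest of the model untouched, which is exactly the engineering benefit the frameworks above provide at scale.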

To provide a bird's-eye view of this field, I will classify the existing models based on the type of deep learning technique they employ.

1> Multi-Layer Perceptron Based Recommendation

An MLP is a feed-forward neural network with one or more hidden layers between the input layer and the output layer. You can interpret an MLP as stacked layers of non-linear transformations that learn hierarchical feature representations. It is a concise but effective network that can approximate any measurable function to any desired degree of accuracy. As such, it is the basis of numerous advanced approaches and is widely used in many areas.
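The "stacked non-linear transformations" view can be written in a few lines of numpy (random, untrained weights): each layer is an affine transform followed by a ReLU, and the layers are simply applied in sequence.

```python
import numpy as np

rng = np.random.default_rng(4)

def mlp_forward(x, layers):
    """Stack of affine transforms, each followed by a ReLU non-linearity."""
    for W, b in layers:
        x = np.maximum(0.0, W @ x + b)
    return x

# Two layers mapping an 8-d input to a 2-d representation.
sizes = [8, 16, 2]
layers = [(rng.normal(size=(m, n)), np.zeros(m))
          for n, m in zip(sizes[:-1], sizes[1:])]

out = mlp_forward(rng.normal(size=8), layers)
```

Training would fit the `(W, b)` pairs by backpropagation; the forward pass above is the structure every MLP-based recommender in this section builds on.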