A good merchant knows what their customers want, but in the world of online classifieds and e-commerce do we really know our customers? How can we tell who is serious and who is simply passing by?

The easy way is simply to talk to them, ask them what they want and try to help them, but online that’s easier said than done. Enter the world of recommender systems: a plethora of frameworks, statistical models and approaches, all aimed at showing our users what they want without them noticing. Loosely defined, a recommender system is just that: take a pool of users and a pool of items, throw in a little algorithm-X, add some salt and serve chilled. OK, it is a little harder than that.

This is the first of three articles in which we discuss how heycar, a Berlin-based start-up, went from an MVP to a more sophisticated, data-driven platform with smarter user engagement through the use of deep learning for recommender systems.

Starting with the basics

As is the case with most start-ups, once the excitement of launching the product dies down and data is flowing into the data warehouse, it’s time to get serious about some of the business’s bigger questions: “How do we drive conversion?”, “How do we optimise customer retention?”, “How do we make money?” Fine, only two of those can really be helped by recommender systems, but that should be reason enough to dive in.

Being a good data-driven company, we decided that our first step should be to develop a “baseline” model. Fortunately, we were already using AWS-hosted Elasticsearch to serve content to our users quickly and reliably. We built the quick-and-dirty baseline from a few hard parameters, such as the condition of the vehicle, and applied a small decay over time based on the date the listing was published. To be sure the baseline was a suitable starting point from which to grow, we also ran a small A|B test against a comparable random sample.
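The baseline idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not give the actual parameters or the Elasticsearch query, so the function name `baseline_score`, the `condition_boost` values, the exponential form of the decay and the 14-day half-life are all hypothetical.

```python
from datetime import date

def baseline_score(condition_boost: float, published: date, today: date,
                   half_life_days: float = 14.0) -> float:
    """Score a listing: a fixed boost for its hard attributes, decayed by listing age."""
    age_days = max((today - published).days, 0)
    # Exponential decay (an assumption): the score halves every `half_life_days`.
    decay = 0.5 ** (age_days / half_life_days)
    return condition_boost * decay

# Hypothetical listings; the boost stands in for "hard parameters" such as condition.
today = date(2019, 1, 15)
listings = [
    {"id": "a", "condition_boost": 1.0, "published": date(2019, 1, 1)},
    {"id": "b", "condition_boost": 1.5, "published": date(2018, 12, 4)},
    {"id": "c", "condition_boost": 1.0, "published": date(2019, 1, 14)},
]

# Rank listings from highest to lowest baseline score.
ranked = sorted(
    listings,
    key=lambda l: baseline_score(l["condition_boost"], l["published"], today),
    reverse=True,
)
ranked_ids = [l["id"] for l in ranked]
```

In this toy data the freshest listing ranks first even though an older one has a larger boost, which is the trade-off a time decay is meant to express.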

Much to our relief, the baseline model performed marginally better than the random sample, and in the process we had also tested a fully functioning A|B testing methodology. Establishing a baseline model is a key step towards building an effective recommender system. Without it, what do you measure against?
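One common way to judge whether “marginally better” is more than noise is a two-proportion z-test on the conversion rates of the two groups. The sketch below is not necessarily the methodology we used, and the counts are invented purely for illustration.

```python
import math

def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference in conversion rates B - A."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Made-up numbers: group A sees random listings, group B sees the baseline ranking.
z, p_value = two_proportion_z_test(conv_a=100, n_a=2000, conv_b=120, n_b=2000)
```

A positive `z` means group B converted at a higher rate; the p-value indicates how surprising that difference would be if the two variants were actually equivalent.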

Surprisingly few people take a model from algorithm to production; more often, models are measured against sample sets to determine validity under closed conditions. Jurassic Park (1993) taught us that scientists are good at making things in a lab but really shouldn’t let them out into the wild, and the same goes for models placed in production with little to no idea of how to measure their success.

Before I continue to talk about Deep Learning, Neural Networks and just about as many buzzwords as I can think of, I would like to mention a few key fundamentals we use for our day-to-day Data Science tasks as a small team:

- Always be hypothesis driven: for us, we had an opportunity to leverage the data we have and our platform to create a meaningful recommender
- Always have a baseline population to measure against
- A|B testing is your best friend, seriously, do it
- Failure is good: an experiment never fails, the hypothesis is proven wrong

Implicit matrix what?

LightFM, not to be confused with Beirut’s number one feel-good radio station, was the first stop in our journey towards a spectacular vehicle recommender system. It is based on the popular technique of matrix factorisation, also known as matrix decomposition. The idea is to factorise a matrix, i.e. to find two or more smaller matrices that, when multiplied together, give back (an approximation of) the original matrix. Now you may be thinking this is a great way to find underlying similarities between two or more entities, and you are absolutely correct, but which similarities do we compare?
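To make the factorisation idea concrete, here is a toy sketch with NumPy. It uses a truncated SVD because that is the shortest way to show two small matrices whose product approximates the original; LightFM itself learns its factors by gradient descent on a ranking loss, so this is an illustration of the concept, not of LightFM’s internals. The interaction matrix and the choice of `k = 2` are made up.

```python
import numpy as np

# Hypothetical interaction matrix: rows are users, columns are listings,
# 1 means the user interacted with the listing.
interactions = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
], dtype=float)

k = 2  # number of latent dimensions (arbitrary for this toy example)

# Truncated SVD: keep only the k strongest components.
u, s, vt = np.linalg.svd(interactions, full_matrices=False)
user_factors = u[:, :k] * s[:k]  # one k-dimensional vector per user
item_factors = vt[:k, :].T       # one k-dimensional vector per listing

# Multiplying the two factor matrices approximates the original matrix.
approx = user_factors @ item_factors.T
error = np.linalg.norm(interactions - approx)
```

Users with similar rows end up with similar vectors in `user_factors`, which is what makes the factors useful for finding “underlying similarities”.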

Put simply, we have users visiting the website and expressing interest in our listings through interactions. But how do you solve the problem of not knowing who your users are, where they live, or any other “-redacted due to GDPR-” demographics? We thought back to our childhood, when we were told that “we are all different” and therefore unique, just like everyone else. Armed with this knowledge, we set out to apply the same logic to our LightFM model: users, although unique, can be grouped together by their implied taste for specific listings on the website.

Introducing implicit matrix factorisation for collaborative filtering. In our case this works as a user-based collaborative filter: we take user A, let’s call her Susan, and recommend items to her based on the items that similar users “liked”, where a “like” is implied by positive interactions. Susan’s identity is inferred from what we know about her, which is simply the listings she has looked at.
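The Susan example can be sketched without any library at all. The version below is a simple neighbourhood-based filter using Jaccard similarity over viewed listings, not the matrix factorisation LightFM performs, and the users, listings and function names are invented for illustration.

```python
from collections import defaultdict

# Hypothetical view logs: user -> set of listing ids they looked at.
views = {
    "susan": {"golf", "polo", "a3"},
    "tom": {"golf", "polo", "a3", "leon"},
    "ana": {"golf", "a3", "octavia"},
    "max": {"x5", "q7"},
}

def jaccard(a: set, b: set) -> float:
    """Overlap between two users' viewed listings, from 0 (none) to 1 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(user: str, views: dict, top_n: int = 3) -> list:
    """Rank unseen listings by the summed similarity of the users who viewed them."""
    seen = views[user]
    scores = defaultdict(float)
    for other, other_seen in views.items():
        if other == user:
            continue
        similarity = jaccard(seen, other_seen)
        if similarity == 0.0:
            continue  # users with nothing in common contribute nothing
        for listing in other_seen - seen:
            scores[listing] += similarity
    # Highest score first; ties broken alphabetically for a stable order.
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [listing for listing, _ in ordered][:top_n]

susan_recs = recommend("susan", views)
```

Note that nothing about Susan is needed beyond the listings she viewed, which is the point of working with implicit feedback.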

This approach, although it may seem a little naïve, is actually quite effective and simple to implement. With a basic first draft that did not need to take into account the latent features belonging to the vehicles, we had a suitable test candidate. But what about those latent features? What insights can we derive from vehicles that affect the way users interact with them?

Latent features

We soon learnt that different users are affected by the latent features belonging to vehicles, but there was a small catch: we needed an efficient way to generalise features across different users. Rather than engineer a time-consuming approach, we decided to bring these features into our LightFM model as item metadata. This appeared to work at first, but we soon discovered that the “optimal” hyperparameters left us with only a small portion of the inventory to recommend to users. At this point we decided it was a good time to experiment with deep learning.
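A shrinking pool of recommended inventory like the one described above can be tracked with a catalogue coverage metric: the share of listings that appear in at least one user’s recommendations. The sketch below assumes recommendations are available as a plain mapping from user to listing ids; the function name and data are hypothetical, and this is not necessarily how we measured it.

```python
def catalogue_coverage(recommendations: dict, inventory: set) -> float:
    """Fraction of the inventory that appears in at least one user's recommendations."""
    if not inventory:
        return 0.0
    recommended = set()
    for items in recommendations.values():
        recommended.update(items)
    # Only count listings that are actually part of the current inventory.
    return len(recommended & inventory) / len(inventory)

# Made-up example: three users all get variations of the same few listings.
inventory = {"golf", "polo", "a3", "leon", "octavia", "x5", "q7", "fabia"}
recommendations = {
    "susan": ["golf", "polo"],
    "tom": ["golf", "a3"],
    "ana": ["polo", "golf"],
}
coverage = catalogue_coverage(recommendations, inventory)
```

Watching this number alongside accuracy metrics during hyperparameter tuning makes it easier to spot a model that scores well while only ever recommending a handful of listings.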

In part 2 we will delve into how we leveraged deep neural networks to recommend some of our vehicle listings to returning users.